What is data visualization ?

**Answer**

Data visualization basically refers to the graphical or visual representation of information and data using visual elements like charts, graphs, maps and so forth.

Name the Python library generally used for data visualization.

**Answer**

The Python library generally used for data visualization is the Matplotlib library.

Is Pyplot a Python library ? What is it ?

**Answer**

Pyplot is a module (sub-library) of Python's matplotlib library rather than a standalone library. It is a collection of methods within the matplotlib library which allows the user to construct 2D plots easily and interactively.

Name the function you will use to create a horizontal bar chart.

**Answer**

The `barh()` function is used to create a horizontal bar chart.
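As a minimal sketch of this call, the snippet below draws a horizontal bar chart. The city names, the rainfall values and the `Agg` backend line are assumptions made for illustration; they are not part of the original answer.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Made-up data, for illustration only
cities = ["Delhi", "Mumbai", "Chennai"]
rainfall_mm = [40, 85, 60]

# The first sequence labels the y-axis; the second sets the bar lengths
bars = plt.barh(cities, rainfall_mm)
plt.xlabel("Rainfall (mm)")
plt.title("Rainfall by city")
# With an interactive backend, plt.show() would display the chart
```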

Which argument will you provide to change the following in a line chart ?

(i) width of the line

(ii) color of the line

**Answer**

(i) For changing the width of the line, we use the `linewidth` argument with the plot() function as: `<matplotlib.pyplot>.plot(<data1> [, <data2>], linewidth = <width>)`

(ii) For changing the color of the line, we use the `color` argument with the plot() function as: `<matplotlib.pyplot>.plot(<data1> [, <data2>], color = <color code>)`
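A short sketch of both arguments used together is given below. The data values are made up, and the `Agg` backend line is only an assumption so that the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Made-up data, for illustration only
x = [1, 2, 3, 4]
y = [1, 4, 9, 16]

# linewidth is given in points; color accepts a name, a code such as "r", or a hex string
lines = plt.plot(x, y, linewidth=3, color="red")
```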

What is a marker ? How can you change the marker type and color in a plot ?

**Answer**

The data points being plotted on a graph/chart are called markers. To change the marker type and color, we use the following additional optional arguments in the plot() function: `marker = <valid marker type>, markeredgecolor = <valid color>`.
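The sketch below shows these arguments in one call. The data and the chosen marker type are illustrative assumptions; `markersize` is an extra optional argument added here only to make the markers easier to see.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Made-up data, for illustration only
x = [1, 2, 3, 4]
y = [2, 4, 6, 8]

# "d" draws thin diamond markers; their outline is coloured red
lines = plt.plot(x, y, marker="d", markersize=8, markeredgecolor="red")
```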

Using which function of Pyplot can you plot histograms ?

**Answer**

With Pyplot, a histogram is created using the `hist()` function.

Are bar charts and histograms the same ?

**Answer**

No, bar charts and histograms are not the same. A bar chart or bar graph presents categorical data with rectangular bars whose heights or lengths are proportional to the values they represent. A histogram, on the other hand, provides a visual interpretation of numerical data by indicating the number of data points that lie within each range of values.

Name various types of histogram plots that you can create using Pyplot.

**Answer**

The types of histogram plots that we can create using Pyplot are cumulative histogram, step type histogram, stacked bar type histogram, horizontal histogram.

What is a frequency polygon ?

**Answer**

A frequency polygon is a type of frequency distribution graph. In a frequency polygon, the number of observations is marked with a single point at the midpoint of an interval.

What is the use of box plot ?

**Answer**

The box plot is used to show the range and the middle half of the ranked data.

Using which function of Pyplot, can you create box plots ?

**Answer**

The Pyplot module's `boxplot()` function is used to create box plots.
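A minimal sketch of a box plot follows. The marks are made-up values, and the `Agg` backend line is an assumption so that the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Made-up marks of ten students, for illustration only
marks = [35, 42, 48, 50, 55, 61, 64, 70, 78, 90]

# boxplot() returns a dictionary of the drawn parts (boxes, medians, whiskers, ...)
result = plt.boxplot(marks)
plt.ylabel("Marks")

# The median line sits at the middle value of the ranked data
median_value = result["medians"][0].get_ydata()[0]
```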

Which library is imported to draw charts in Python ?

- csv
- matplotlib
- numpy
- pandas

**Answer**

matplotlib

**Reason** — The library imported to draw charts in Python is Matplotlib.

PyPlot is an interface of Python's ............... library.

- seaborn
- plotly
- ggplot
- matplotlib

**Answer**

matplotlib

**Reason** — PyPlot is an interface of Python's Matplotlib library that allows users to construct 2D plots easily and interactively.

For 2D plotting using a Python library, which library interface is often used ?

- seaborn
- plotly
- matplotlib
- matplotlib.pyplot

**Answer**

matplotlib.pyplot

**Reason** — The `matplotlib.pyplot` interface is commonly used for 2D plotting in Python using the Matplotlib library.

Which of the following is not a valid chart type ?

- histogram
- statistical
- pie
- box

**Answer**

statistical

**Reason** — Histograms, pie charts and box plots are all valid chart types; Pyplot provides the `hist()`, `pie()` and `boxplot()` functions for them. "Statistical" is not the name of a specific chart type.

Which of the following is not a valid plotting function of Pyplot ?

- plot()
- bar()
- line()
- pie()

**Answer**

line()

**Reason** — The `line()` function is not a valid plotting function of Pyplot. In Matplotlib's Pyplot module, the correct function for creating line plots is `plot()`.

Which of the following plotting functions does not plot multiple data series ?

- plot()
- bar()
- pie()
- barh()

**Answer**

pie()

**Reason** — The `pie()` function in Matplotlib's Pyplot module can plot only one data sequence. On the other hand, functions like `plot()`, `bar()`, and `barh()` can plot multiple data series in a single chart.

The plot which tells the trend between two graphed variables is the ............... graph/chart.

- line
- scatter
- bar
- pie

**Answer**

line

**Reason** — A line chart or line graph is a type of chart which displays information as a series of data points called "markers" connected by straight line segments.

The plot which tells the correlation between two variables which may not be directly related is ............... graph/chart.

- line
- scatter
- bar
- pie

**Answer**

scatter

**Reason** — A scatter chart is a graph of plotted points on two axes that show the relationship between two sets of data.

A ............... is a summarization tool for discrete or continuous data.

- quartile
- histogram
- mean
- median

**Answer**

histogram

**Reason** — A histogram is a summarization tool for discrete or continuous data. A histogram provides a visual interpretation of numerical data by indicating the number of data points that lie within a range of values.

A visual representation of the statistical five number summary of a given dataset is known as ............... .

- histogram
- frequency distribution
- box plot
- frequency polygon

**Answer**

box plot

**Reason** — A box plot provides a visual representation of the statistical five-number summary of a given dataset. It includes the highest and lowest numbers, the median, and the upper and lower quartiles.

Which of the following functions is used to create a line chart ?

- line()
- plot()
- chart()
- plotline()

**Answer**

plot()

**Reason** — The `plot()` function is used to create a line chart or line graph.

Which of the following functions will produce a bar chart ?

- plot()
- bar()
- plotbar()
- barh()

**Answer**

bar(), barh()

**Reason** — With Pyplot, a bar chart is created using the `bar()` and `barh()` functions.

Which of the following functions will create a vertical bar chart ?

- plot()
- bar()
- plotbar()
- barh()

**Answer**

bar()

**Reason** — The `bar()` function will create a vertical bar chart.

Which of the following functions will create a horizontal bar chart ?

- plot()
- bar()
- plotbar()
- barh()

**Answer**

barh()

**Reason** — The `barh()` function will create a horizontal bar chart.

To specify the style of line as dashed, which argument of plot() needs to be set ?

- line
- width
- style
- linestyle

**Answer**

linestyle

**Reason** — To specify the style of the line as dashed in Matplotlib's `plot()` function, we need to set the `linestyle` argument to 'dashed'.
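As a small illustration of the `linestyle` argument, the sketch below draws a dashed line. The data values are made up, and the `Agg` backend line is an assumption so that the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Made-up data, for illustration only
x = [1, 2, 3, 4]
y = [10, 20, 15, 25]

# linestyle="dashed" (short form "--") draws the line as dashes
lines = plt.plot(x, y, linestyle="dashed")
```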

The data points plotted on a graph are called ............... .

- points
- pointers
- marks
- markers

**Answer**

markers

**Reason** — The data points being plotted on a graph/chart are called markers.

A ............... graph is a type of chart which displays information as a series of data points connected by straight line segments.

- line
- bar
- pie
- box plot

**Answer**

line

**Reason** — A line chart or line graph is a type of chart which displays information as a series of data points called "markers" connected by straight line segments.

To create scatter charts using plot(), which argument is skipped ?

- marker
- linestyle
- markeredgecolor
- linewidth

**Answer**

linestyle

**Reason** — When creating scatter charts using Matplotlib's `plot()` function, the `linestyle` argument is skipped because scatter plots do not use line styles.

In scatter(), which argument is used to specify the size of data points ?

- size
- s
- marker
- markersize

**Answer**

s

**Reason** — The `s` argument in `scatter()` is used to specify the size of data points.
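The sketch below shows the `s` argument with one size per point. The data and sizes are made-up values, and the `Agg` backend line is an assumption so that the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Made-up data, for illustration only
x = [1, 2, 3, 4]
y = [5, 3, 8, 6]

# s is the marker area in points squared; a single number or one value per point
points = plt.scatter(x, y, s=[20, 60, 120, 200], marker="o")
```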

Which argument of bar() lets you set the thickness of bar ?

- thick
- thickness
- width
- barwidth

**Answer**

width

**Reason** — The `width` argument controls the thickness of the bars in a bar chart created using the `bar()` function in Matplotlib.

To change the width of bars in a bar chart, which of the following arguments with a float value is used ?

- hwidth
- width
- breath
- barwidth

**Answer**

width

**Reason** — The `width` argument with a float value is used to change the width of bars in a bar chart created using the `bar()` function in Matplotlib.
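A minimal sketch of the `width` argument is given below. The category names and values are made up, and the `Agg` backend line is an assumption so that the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Made-up data, for illustration only
sections = ["A", "B", "C"]
students = [32, 28, 35]

# width is a float; the default is 0.8, so 0.4 gives thinner bars
bars = plt.bar(sections, students, width=0.4)
```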

Which function lets you set the title of the plot ?

- title()
- plottitle()
- graphtitle()
- all of these

**Answer**

title()

**Reason** — The `title()` function sets the title of the plot in Matplotlib.

The command used to give a heading to a graph is ............... .

- plt.show()
- plt.plot()
- plt.xlabel()
- plt.title()

**Answer**

plt.title()

**Reason** — The `plt.title()` command is used in Matplotlib's Pyplot module to give a heading or title to a graph.

Which function would you use to set the limits for x-axis of the plot ?

- limits()
- xlimits()
- xlim()
- lim()

**Answer**

xlim()

**Reason** — The `xlim()` function is used to set the limits for the x-axis of a plot in Matplotlib.

Which function is used to show legends ?

- display()
- show()
- legend()
- legends()

**Answer**

legend()

**Reason** — The `legend()` function in Matplotlib's Pyplot module is used to show legends in a plot.

Which argument must be set with plotting functions for legend() to display the legends ?

- data
- label
- name
- sequence

**Answer**

label

**Reason** — The `label` argument must be set with plotting functions for the `legend()` function to display the legends correctly.
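The way `label` and `legend()` work together can be sketched as follows. The two data series and their label texts are made up, and the `Agg` backend line is an assumption so that the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Made-up data, for illustration only
x = [1, 2, 3, 4]
squares = [1, 4, 9, 16]
cubes = [1, 8, 27, 64]

# Each series gets a label; legend() collects these labels into the legend box
plt.plot(x, squares, label="Squares")
plt.plot(x, cubes, label="Cubes")
legend_box = plt.legend(loc="upper left")

legend_entries = [text.get_text() for text in legend_box.get_texts()]
```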

Which function is used to create a histogram ?

- histo()
- histogram()
- hist()
- histtype

**Answer**

hist()

**Reason** — The `hist()` function is used to create a histogram.

Which argument in hist() is used to create a stacked bar type histogram ?

- histt
- histtype
- type
- barstacked

**Answer**

histtype

**Reason** — The `histtype` argument in Matplotlib's `hist()` function is used to create a stacked bar type histogram. Setting `histtype = 'barstacked'` creates a histogram where bars for each bin are stacked on top of each other, representing different categories or subgroups within the data.
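A sketch of a stacked bar type histogram for two made-up groups of marks is shown below. The data, bin edges and labels are illustrative assumptions, and the `Agg` backend line is there only so that the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Made-up marks of two classes, for illustration only
class_a = [45, 52, 58, 61, 67, 70, 74, 81]
class_b = [40, 48, 55, 63, 66, 72, 79, 88]
bin_edges = [40, 50, 60, 70, 80, 90]

# With histtype="barstacked", the bars of class_b are stacked on top of class_a in each bin
counts, edges, patches = plt.hist(
    [class_a, class_b],
    bins=bin_edges,
    histtype="barstacked",
    label=["Class A", "Class B"],
)
plt.legend()

# For stacked histograms the last row of counts holds the total height of each stack
stack_heights = list(counts[-1])
```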

Which of the following functions can plot only one data series ?

- plot()
- bar()
- boxplot()
- pie()

**Answer**

pie()

**Reason** — The `pie()` function in Matplotlib's Pyplot module can plot only one data series. On the other hand, functions like `plot()`, `bar()`, and `boxplot()` can plot multiple data series in a single chart.

Which argument must be provided to create wedges out of a pie chart ?

- label
- autopct
- explode
- wedge

**Answer**

explode

**Reason** — The `explode` argument is used in pie charts to visually separate one or more wedges from the rest of the pie chart.

Which argument should be set to display percentage share of each pie on a pie chart ?

- label
- autopct
- explode
- wedge

**Answer**

autopct

**Reason** — To view the percentage share of each slice in a pie chart, we need to add the `autopct` argument with a format string, such as `"%1.1f%%"`.
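The `explode` and `autopct` arguments can be sketched together as below. The expense heads and their shares are made up, and the `Agg` backend line is an assumption so that the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Made-up data, for illustration only
heads = ["Rent", "Food", "Travel", "Other"]
shares = [40, 30, 20, 10]

# explode pulls the second wedge out; autopct prints each share as a percentage
wedges, label_texts, percent_texts = plt.pie(
    shares,
    labels=heads,
    explode=[0, 0.1, 0, 0],
    autopct="%1.1f%%",
)

percentages = [text.get_text() for text in percent_texts]
```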

Which function creates a box plot ?

- box()
- plot()
- boxplot()
- showbox()

**Answer**

boxplot()

**Reason** — The `boxplot()` function is used to create box plots in Matplotlib.

Which argument of boxplot() is used to create a filled boxplot ?

- fill
- box
- patch_artist
- patch

**Answer**

patch_artist

**Reason** — The `patch_artist` argument in the `boxplot()` function is used to create a filled box plot. When set to True, it fills the boxes of the box plot with a color, making them more visually distinct.
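A minimal sketch of a filled box plot follows. The marks and the fill color are made-up choices, `showmeans` is added only to show one more optional argument, and the `Agg` backend line is an assumption so that the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Made-up marks, for illustration only
marks = [35, 42, 48, 50, 55, 61, 64, 70, 78, 90]

# patch_artist=True draws each box as a filled patch; showmeans=True marks the mean
result = plt.boxplot(marks, patch_artist=True, showmeans=True)

# Filled boxes can then be given a color of our choice
for box in result["boxes"]:
    box.set_facecolor("lightblue")
```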

A *histogram* is a plot that shows the underlying frequency distribution of a set of continuous data.

Pyplot interface is a collection of methods within the *matplotlib* library of Python.

Pyplot's *plot()* function is used to create line charts.

Pyplot's *barh()* function is used to create horizontal bar charts.

Pyplot's *scatter()* function is used to create scatter charts.

Pyplot's *hist()* function is used to create histograms.

The data points plotted on a graph are called *markers*.

The *linewidth* argument of plot() specifies the width for the line.

The *linestyle* argument of plot() specifies the style of the line.

The *width* argument of bar() specifies the bar width.

The *xticks()* function is used to specify ticks for x-axis.

To save a plot, *savefig()* function is used.

The *orientation* argument of hist() is set to create a horizontal histogram.

The *showmeans* argument shows the arithmetic mean on a boxplot.

The *notch* argument in a boxplot() creates a notched boxplot.

The *loc* argument of legend() provides the location of legend.

Using Python Matplotlib, a *histogram* can be used to count how many values fall into each interval. (line plot / bar graph / histogram)

PyPlot is a sub-library of matplotlib library.

**Answer**

True

**Reason** — PyPlot is a sub-library of Matplotlib library. It allows users to construct 2D plots easily and interactively. Pyplot essentially reproduces plotting functions and behavior of MATLAB.

Statement import pyplot.matplotlib is a valid statement for working on pyplot functions.

**Answer**

False

**Reason** — The statement `import matplotlib.pyplot` is the valid and commonly used way to import the PyPlot submodule from the Matplotlib library.

By default, pie chart is printed in elliptical or oval shape.

**Answer**

True

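**Reason** — In older Matplotlib versions (before 3.0), a pie chart is printed in an elliptical or oval shape by default. From Matplotlib 3.0 onwards, `pie()` sets an equal aspect ratio itself, so the chart is circular by default.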

The default shape of pie chart cannot be changed from oval.

**Answer**

False

**Reason** — By default (in Matplotlib versions before 3.0), a pie chart is printed in an elliptical or oval shape. It can be changed to a circle by using the `axis()` function of Pyplot and passing the 'equal' argument to it.

A line chart can be plotted using pyplot library's line() function.

**Answer**

False

**Reason** — In Matplotlib's Pyplot library, the function used to plot a line chart is `plot()`.

A line chart can be plotted using pyplot library's plot() function.

**Answer**

True

**Reason** — In Matplotlib's Pyplot library, the function used to plot a line chart is `plot()`.

A bar chart can be plotted using pyplot library's bar() function.

**Answer**

True

**Reason** — In Matplotlib's Pyplot library, the function used to plot a bar chart is `bar()`.

A bar chart can be plotted using pyplot library's barh() function.

**Answer**

True

**Reason** — The `barh()` function in Matplotlib's Pyplot library is used to create horizontal bar charts.

It is not possible to plot multiple series of values in the same bar graph.

**Answer**

False

**Reason** — It is possible to plot multiple series of values in the same bar graph using Matplotlib's Pyplot library because the `bar()` and `barh()` functions support handling multiple datasets.
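One common way to place two series in the same bar graph is to call `bar()` once per series and shift the second set of bars sideways. The sketch below follows that approach; the class names, counts and bar width are made-up assumptions, and the `Agg` backend line is there only so that the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Made-up data, for illustration only
classes = ["X", "XI", "XII"]
boys = [30, 25, 28]
girls = [27, 29, 26]

positions = np.arange(len(classes))  # 0, 1, 2
bar_width = 0.35

# The second series is shifted right by one bar width so the bars sit side by side
plt.bar(positions, boys, width=bar_width, label="Boys")
plt.bar(positions + bar_width, girls, width=bar_width, label="Girls")

# Put each class name midway between its pair of bars
plt.xticks(positions + bar_width / 2, classes)
plt.legend()

total_bars = len(plt.gca().patches)
```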

A standard marker of representing a non-number data in Python libraries is NaN.

**Answer**

True

**Reason** — A standard marker for representing missing or non-number data in Python libraries is NaN (Not a Number).

If the linestyle argument is missing along with markerstyle-string in a plot(), a scatter type chart gets created.

**Answer**

True

**Reason** — When a marker style string (for example `'ro'`) is passed to the `plot()` function without a `linestyle` argument, only the markers are drawn. The points are plotted without connecting lines, so the resulting chart resembles a scatter chart.

The bar() function can also create horizontal bar charts.

**Answer**

False

**Reason** — The `bar()` function in Matplotlib's Pyplot library can create vertical bar charts, while the `barh()` function creates horizontal bar charts.

The pie() function can plot multiple data series.

**Answer**

False

**Reason** — The `pie()` function in Matplotlib's Pyplot module can plot only one data series.

The plot is always as per the data series being plotted irrespective of the xlim().

**Answer**

False

**Reason** — The plot appearance can be affected by the data series being plotted, but it can also be influenced by functions such as `xlim()`, which determine the range of values shown on the x-axis.

Frequency polygon is created from histogram.

**Answer**

True

**Reason** — When a frequency polygon is drawn manually, it is based on the data that would be used to create a histogram.

What is not true about Data Visualization ?

(a) Graphical representation of information and data.

(b) Helps users in analyzing a large amount of data in a simpler way.

(c) Data Visualization makes complex data more accessible, understandable, and usable.

(d) No library needs to be imported to create charts in Python language.

**Answer**

No library needs to be imported to create charts in Python language.

**Reason** — To create charts and visualizations in Python, we need to import libraries such as Matplotlib.

**Assertion.** The matplotlib library of Python is used for data visualization.

**Reason.** The PyPlot interface of matplotlib library is used for 2D plotting.

- Both A and R are true and R is the correct explanation of A.
- Both A and R are true but R is not the correct explanation of A.
- A is true but R is false.
- A is false but R is true.

**Answer**

Both A and R are true and R is the correct explanation of A.

**Explanation**

The Matplotlib library is used for data visualization in Python, providing a variety of tools and functionalities for creating different types of plots, charts, and graphs. The Pyplot module, which is a collection of methods within the Matplotlib library, allows users to construct 2D plots easily and interactively.

**Assertion.** A scatter chart simply plots the data points on a chart to show the trend in the data.

**Reason.** A line chart connects the plotted data points with a line.

- Both A and R are true and R is the correct explanation of A.
- Both A and R are true but R is not the correct explanation of A.
- A is true but R is false.
- A is false but R is true.

**Answer**

Both A and R are true but R is not the correct explanation of A.

**Explanation**

The scatter chart is a graph of plotted points on two axes that show the relationship between two sets of data. On the other hand, a line chart, or line graph, is a type of chart that displays information as a series of data points called 'markers' connected by straight line segments.

**Assertion.** Both scatter() and plot() functions of PyPlot can create scatter charts.

**Reason.** The plot() function can create line charts as well as scatter charts.

- Both A and R are true and R is the correct explanation of A.
- Both A and R are true but R is not the correct explanation of A.
- A is true but R is false.
- A is false but R is true.

**Answer**

Both A and R are true and R is the correct explanation of A.

**Explanation**

Both the `scatter()` and `plot()` functions in Pyplot can create scatter charts. The `plot()` function in Pyplot can create both line charts and scatter charts. When a marker style string is specified without providing a linestyle argument, the `plot()` function will create a scatter chart.
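The sketch below shows `plot()` behaving like a scatter chart when it is given only a marker style string. The data values are made up, and the `Agg` backend line is an assumption so that the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Made-up data, for illustration only
x = [1, 2, 3, 4, 5]
y = [3, 7, 4, 9, 6]

# "ro" means red circle markers; no line style is given, so no line is drawn
lines = plt.plot(x, y, "ro")

# scatter() plots the same kind of unconnected points (shifted up here to stay visible)
points = plt.scatter(x, [value + 2 for value in y])
```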

**Assertion.** For the same sets of data, you can create various charts using plot(), scatter(), pie(), bar() and barh().

**Reason.** All the data sets of a plot(), scatter(), bar() cannot be used by pie() ; it will work with only a single set of data.

- Both A and R are true and R is the correct explanation of A.
- Both A and R are true but R is not the correct explanation of A.
- A is true but R is false.
- A is false but R is true.

**Answer**

A is false but R is true.

**Explanation**

We can create various charts using `plot()`, `scatter()`, `bar()`, and `barh()` for the same datasets, but not using `pie()`. The `pie()` function specifically works with a single set of data, whereas the other functions can handle multiple datasets or series.

**Assertion.** Five-point statistical summary of a data set can be visually represented.

**Reason.** The boxplot() function can plot the highest and lowest numbers of a data range, its median along with the upper and lower quartiles.

- Both A and R are true and R is the correct explanation of A.
- Both A and R are true but R is not the correct explanation of A.
- A is true but R is false.
- A is false but R is true.

**Answer**

Both A and R are true and R is the correct explanation of A.

**Explanation**

The five-point statistical summary of a dataset can be visually represented through a box plot. A box plot is used to display the range and middle half of ranked data. It uses five important numbers from the data range: the extremes (highest and lowest numbers), the median, and the upper and lower quartiles, comprising the five-number statistical summary.

**Assertion.** Line graph is a tool for comparison and is created by plotting a series of several points and connecting them with a straight line.

**Reason.** You should never use a line chart when the chart is in a continuous data set.

- Both A and R are true and R is the correct explanation of A.
- Both A and R are true but R is not the correct explanation of A.
- A is true but R is false.
- A is false but R is true.

**Answer**

A is true but R is false.

**Explanation**

A line graph is a tool for comparison, created by plotting a series of data points called 'markers' and connecting them with straight lines. This makes it easier to compare different data points and observe patterns. Line charts are suitable for continuous data sets, displaying information as a series of data and not restricted to discontinuous data sets.

Name the library of which the PyPlot is an interface.

**Answer**

PyPlot is an interface provided by the Matplotlib library.

Write the statement to import PyPlot in your script.

**Answer**

The statement to import PyPlot in our script is as follows:

`import matplotlib.pyplot as plt`
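A very small script using this import might look like the sketch below. The data, title and axis labels are made up, and the `Agg` backend line is an assumption so that the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Made-up data, for illustration only
numbers = [1, 2, 3, 4]
squares = [1, 4, 9, 16]

plt.plot(numbers, squares)
plt.title("Squares of numbers")
plt.xlabel("Number")
plt.ylabel("Square")
# With an interactive backend, plt.show() would display the chart
```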

Name the functions to create the following :

(a) line chart

(b) bar chart

(c) horizontal bar chart

(d) histogram

(e) scatter chart

(f) boxplot

(g) pie chart

**Answer**

(a) line chart: `plot()` function

(b) bar chart: `bar()` function

(c) horizontal bar chart: `barh()` function

(d) histogram: `hist()` function

(e) scatter chart: `scatter()` function

(f) boxplot: `boxplot()` function

(g) pie chart: `pie()` function

What is a line chart ?

**Answer**

A line chart, or line graph, is a type of chart that displays information as a series of data points called 'markers' connected by straight line segments.

What is a scatter chart ? How is it different from line chart ?

**Answer**

The scatter chart is a graph of plotted points on two axes that show the relationship between two sets of data. With a scatter plot, a mark or marker (usually a dot or small circle), represents a single data point. With one mark (point) for every data point a visual distribution of the data can be seen. Depending on how tightly the points cluster together, we may be able to discern a clear trend in the data.

The difference is that in a scatter chart the individual points are not connected to one another with a line; any trend is read from how the points cluster. A line chart, on the other hand, joins consecutive data points with straight line segments.

What is the utility of pie chart ?

**Answer**

A pie chart is used to show parts in relation to the whole, often representing percentage shares and numerical proportions.

What is a bar chart ? How is it useful as compared to the line chart ?

**Answer**

A bar graph / bar chart is a graphical display of data using bars of different heights.

Compared to a line chart, which connects data points with lines, a bar chart is useful for comparing discrete categories rather than showing continuous trends over time. Bar charts are effective for highlighting differences in values between categories and are particularly useful when dealing with categorical data or comparing data across different groups or time periods.

What is a histogram ? What is its usage/utility ?

**Answer**

A histogram is a summarization tool for discrete or continuous data. A histogram provides a visual interpretation of numerical data by indicating the number of data points that lie within a range of values (called "bins"). Histograms are a great way to show results of continuous data, such as: weight, height, how much time, and so forth.

What is a boxplot ? Which situations are more appropriate for boxplot ?

**Answer**

A boxplot is a graphical representation of the distribution of a dataset through five summary statistics: the extremes (the highest and the lowest numbers), the median, and the upper and lower quartiles.

Box plots are suitable for visualizing the spread of data, identifying outliers, comparing data distribution between different groups or categories, and assessing symmetry in a dataset.

What is a frequency polygon ? What is its utility ?

**Answer**

A frequency polygon is a type of frequency distribution graph. In a frequency polygon, the number of observations is marked with a single point at the midpoint of an interval. A straight line then connects each set of points. Frequency polygons make it easy to compare two or more distributions on the same set of axes.

Name the function to label axes.

**Answer**

The functions to label axes in a plot using Matplotlib's Pyplot library are `xlabel()` for the x-axis and `ylabel()` for the y-axis.

Name the function to give title to a plot.

**Answer**

The function `title()` in Matplotlib's Pyplot library is used to add a title to a plot.

Name the function to set figure size of a plot.

**Answer**

The `figure()` function in Matplotlib's Pyplot library is used to set the figure size of a plot, through its `figsize` argument.
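As a brief sketch, the figure size is passed as a (width, height) pair. The size chosen and the data are made-up values, and the `Agg` backend line is an assumption so that the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# figsize is (width, height) in inches; call figure() before the plotting function
fig = plt.figure(figsize=(8, 4))
plt.plot([1, 2, 3], [2, 4, 6])

width_in, height_in = fig.get_size_inches()
```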

Name the function to set limits for the axes.

**Answer**

The functions to set limits for the axes in a plot using Matplotlib's Pyplot library are `xlim()` for the x-axis and `ylim()` for the y-axis.

Name the function to show legends on a plot.

**Answer**

The `legend()` function in Matplotlib's Pyplot library is used to display a legend on the plot.

Name the function to add ticks on axes.

**Answer**

The functions to add ticks on axes in a plot using Matplotlib's Pyplot library are `xticks()` for the x-axis and `yticks()` for the y-axis.
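The sketch below sets tick positions, and optional tick labels, on both axes. The data, tick positions and day names are made up, and the `Agg` backend line is an assumption so that the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Made-up data, for illustration only
day_numbers = [0, 1, 2]
visitors = [120, 150, 90]

plt.plot(day_numbers, visitors)

# First argument: tick positions; optional second argument: text shown at each tick
plt.xticks([0, 1, 2], ["Mon", "Tue", "Wed"])
plt.yticks([50, 100, 150, 200])

x_tick_positions = list(plt.gca().get_xticks())
y_tick_positions = list(plt.gca().get_yticks())
```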

What is the significance of data visualization ?

**Answer**

Patterns, trends and correlations that might go undetected in text-based data can be exposed and recognized easier with data visualization techniques or tools such as line chart, bar chart, pie chart, histogram, scatter chart etc. Thus with data visualization tools, information can be processed in efficient manner and hence better decisions can be made.

How does Python support data visualization ?

**Answer**

Python supports data visualization by providing several useful libraries. The most commonly used data visualization library is matplotlib, a Python library also sometimes known as the plotting library. The matplotlib library offers a very extensive range of 2D plot types and output formats. It offers complete 2D support along with limited 3D graphics support. It is useful in producing publication-quality figures in interactive environments across platforms, and it can also be used for animations. There are many other Python libraries that can be used for data visualization, but matplotlib is very popular for 2D plotting.

What is the use of matplotlib and pyplot ?

**Answer**

For data visualization in Python, the matplotlib library's Pyplot interface is used. Matplotlib is a Python library that provides interfaces and functionalities for 2D graphics, similar to MATLAB's in various forms. It offers both a quick way to visualize data in Python and creates publication-quality figures in many formats. The Matplotlib library offers various named collections of methods. Pyplot, as one such interface, enables users to construct 2D plots easily and interactively.

What are the popular ways of plotting data ?

**Answer**

The popular ways of plotting data include line charts, bar charts, histograms, scatter plots, pie charts, box plots.

Compare bar() and barh() functions.

**Answer**

| bar() function | barh() function |
|---|---|
| This function is used to create vertical bar charts. | This function is used to create horizontal bar charts. |
| In a vertical bar chart, the bars are plotted along the vertical axis (y-axis) with their lengths representing the values being plotted. | In a horizontal bar chart, the bars are plotted along the horizontal axis (x-axis) with their lengths representing the values being plotted. |
| The first sequence given in the bar() forms the x-axis and the second sequence values are plotted on y-axis. | The first sequence given in the barh() forms the y-axis and the second sequence values are plotted on x-axis. |

What is the role of legends in a graph/chart ?

**Answer**

In a chart/graph, there may be multiple datasets plotted. To distinguish among various datasets plotted in the same chart, legends are used. Legends can be different colors/patterns assigned to different specific datasets. The legends are shown in a corner of a chart/graph.

What will happen if you use legend() without providing any label for the data series being plotted ?

**Answer**

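If the `legend()` function is used when none of the plotted data series has a label, Matplotlib has nothing to list: no legend entries are shown and a warning is issued that no labelled artists were found. The viewer then gets no information about which data series is which, so a `label` should be given in each plotting function.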

What do you understand by xlimit and ylimit ? How are these linked to data being plotted ?

**Answer**

The xlimit and ylimit determine which data values are visible on the x-axis and y-axis in a plot or chart respectively. Only the data values that fall within these limits will be plotted. If no data value maps to the specified x-limits or y-limits, nothing will show on the plot for that particular axis range.
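The effect of the limits can be sketched with `xlim()` and `ylim()` as below. The data and the chosen limits are made-up values, and the `Agg` backend line is an assumption so that the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Made-up data, for illustration only
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]

plt.plot(x, y)

# Only the part of the line with x between 2 and 4 and y between 0 and 20 stays visible
plt.xlim(2, 4)
plt.ylim(0, 20)

# Called without arguments, xlim() and ylim() return the current limits
x_limits = tuple(plt.xlim())
y_limits = tuple(plt.ylim())
```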

When should you use

(i) a line chart

(ii) a bar chart

(iii) a scatter chart

(iv) pie chart

(v) boxplot ?

**Answer**

(i) **Line Chart** — Use a line chart to show trends or changes over time. It's suitable for displaying continuous data series and highlighting patterns or fluctuations.

(ii) **Bar Chart** — Use a bar chart to compare categories or groups. It's effective for displaying discrete data and showing differences or relationships between items.

(iii) **Scatter Chart** — Use a scatter chart to visualize relationships between two variables. It's helpful for identifying correlations or trends in data points.

(iv) **Pie Chart** — Use a pie chart to represent parts of a whole. It's useful for showing the proportion or distribution of different categories within a dataset.

(v) **Boxplot** — The box plot is used to show the range and the middle half of ranked data while identifying outliers or variability.

A list namely temp contains average temperatures for seven days of last week. You want to see how the temperature changed in last seven days. Which chart type will you plot for the same and why ?

**Answer**

A line chart is the suitable choice for visualizing how the temperature changed over the last seven days. The line chart shows trends over time and displays continuous data, making it ideal for representing temperature values. The chart's ability to connect data points allows viewers to easily observe temperature trends and understand variations across the seven-day period.
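A possible sketch of such a chart is given below. The temperature values in `temp` and the day names are made up for illustration, and the `Agg` backend line is an assumption so that the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Made-up average temperatures (in degrees Celsius) for seven days
temp = [31.5, 32.0, 30.8, 29.4, 30.1, 31.9, 33.2]
days = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]

# A line with a marker on each day shows how the temperature changed
lines = plt.plot(days, temp, marker="o")
plt.xlabel("Day")
plt.ylabel("Average temperature (°C)")
plt.title("Temperature over the last seven days")
```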

What is histogram ? How do you create histograms in Python ?

**Answer**

A histogram is a summarization tool for discrete or continuous data, providing a visual interpretation of numerical data by showing the number of data points that fall within a specified range of values.

The `hist()` function of the Pyplot module is used to create and plot a histogram from a given sequence of numbers. The syntax for using the `hist()` function in Pyplot is as follows:

`matplotlib.pyplot.hist(x, bins = None, cumulative = False, histtype = 'bar', align = 'mid', orientation = 'vertical')`

What are various types of histograms that can be created through hist() function ?

**Answer**

The `hist()` function in Matplotlib's Pyplot module allows creating various types of histograms. These include the default bar histogram (`histtype='bar'`), step histogram (`histtype='step'`), stepfilled histogram (`histtype='stepfilled'`) and barstacked histogram (`histtype='barstacked'`).
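The four styles can be compared side by side with a sketch like the following. It reuses the French-fries weights that appear in this exercise set; the choice of 5 bins and the 2 x 2 layout are assumptions made only for the illustration.

```python
import matplotlib.pyplot as plt

# Weights of 16 French-fries orders, as used in this exercise set
data = [78, 72, 69, 81, 63, 67, 65, 75, 79, 74, 71, 83, 71, 79, 80, 69]
types = ['bar', 'barstacked', 'step', 'stepfilled']

fig, axes = plt.subplots(2, 2, figsize=(8, 6))
counts_by_type = {}
for ax, histtype in zip(axes.flat, types):
    # hist() returns the bin counts, the bin edges and the drawn patches
    counts, edges, patches = ax.hist(data, bins=5, histtype=histtype)
    counts_by_type[histtype] = counts.tolist()
    ax.set_title("histtype = '" + histtype + "'")
plt.tight_layout()
plt.show()
```

Only the drawing style changes. The bin counts are the same for all four types because the data and the bins are the same.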

When should you create histograms and when should you create bar charts to present data visually ?

**Answer**

Histograms are great for displaying specific ranges of values and are ideal for visualizing the results of continuous data, such as the ages of students in a class. Bar charts, on the other hand, are effective for comparing categorical or discrete data across different categories or groups, such as comparing the sales performance of different products.
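The difference can be sketched with two small, invented datasets: a list of student ages for the histogram and product-wise sales for the bar chart. All values and names below are hypothetical.

```python
import matplotlib.pyplot as plt

# Hypothetical data, invented for illustration
ages = [14, 15, 15, 16, 16, 16, 17, 17, 18, 15, 16, 17, 14, 16, 15]
products = ['Pens', 'Notebooks', 'Bags']
sales = [120, 200, 80]

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(9, 4))

# Histogram: numerical values are grouped into ranges (bins)
counts, edges, _ = ax1.hist(ages, bins=4)
ax1.set_title('Histogram: Ages of Students')

# Bar chart: one bar per category
bars = ax2.bar(products, sales)
ax2.set_title('Bar Chart: Sales by Product')

plt.tight_layout()
plt.show()
```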

What is cumulative histogram ? How do you create it using PyPlot ?

**Answer**

A cumulative histogram is a graphical representation in which each bin displays the count of data points within that bin as well as the counts of all smaller bins. The final bin in this histogram indicates the total number of data points in the dataset.

In Matplotlib's `hist()` function, we can create a cumulative histogram by setting the `cumulative` parameter to True. The syntax is as follows: `matplotlib.pyplot.hist(x, bins = None, histtype='barstacked', cumulative=True)`.
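A short sketch, using a small hypothetical dataset, shows that the last bin of a cumulative histogram holds the total number of data points.

```python
import matplotlib.pyplot as plt

# Hypothetical data: 10 values between 1 and 5
data = [1, 2, 2, 3, 3, 3, 4, 4, 5, 5]

# With cumulative=True each bin also includes the counts of all smaller bins
counts, edges, patches = plt.hist(data, bins=5, cumulative=True)
print(counts)   # running totals: 1, 3, 6, 8, 10
plt.title('Cumulative Histogram')
plt.show()
```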

What is frequency polygon ? How do you create it ?

**Answer**

A frequency polygon is a type of frequency distribution graph. In a frequency polygon, the number of observations is marked with a single point at the midpoint of an interval, and a straight line then connects each set of points.

We can create a frequency polygon in the following two ways:

- Drawing Frequency Polygon Manually
- Creating Frequency Polygon through a Line Chart
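The second approach can be sketched as follows. This is one possible way to do it, not the only one: it assumes NumPy's `histogram()` function to count the values in each bin and then draws the midpoints with `plot()`. The data and the choice of 5 bins are for illustration only.

```python
import numpy as np
import matplotlib.pyplot as plt

# Weights of 16 French-fries orders, as used in this exercise set
data = [78, 72, 69, 81, 63, 67, 65, 75, 79, 74, 71, 83, 71, 79, 80, 69]

# Count how many values fall in each of 5 equal-width bins
counts, edges = np.histogram(data, bins=5)

# Midpoint of each bin
midpoints = (edges[:-1] + edges[1:]) / 2
width = edges[1] - edges[0]

# Add a zero-frequency point one bin width beyond each end to close the polygon
xs = np.concatenate(([midpoints[0] - width], midpoints, [midpoints[-1] + width]))
ys = np.concatenate(([0], counts, [0]))

plt.plot(xs, ys, marker='o')
plt.title('Frequency Polygon')
plt.show()
```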

What is 5 point summary ?

**Answer**

The five-point summary is a descriptive statistics tool that provides a concise summary of the distribution of a dataset. It consists of five important numbers of a data range:

- the minimum range value
- the maximum range value
- the upper quartile
- the lower quartile
- the median
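These five numbers can be computed with NumPy, as in the sketch below. It uses the ordered dataset from the boxplot questions of this exercise set. Note that textbooks use slightly different rules for quartiles; the code relies on the default method of `np.percentile()`, which is an assumption of this example.

```python
import numpy as np

data = [63, 65, 67, 69, 69, 71, 71, 72, 74, 75, 78, 79, 79, 80, 81, 83]

minimum = np.min(data)
lower_quartile = np.percentile(data, 25)   # 25% of the values lie below this
median = np.median(data)
upper_quartile = np.percentile(data, 75)   # 75% of the values lie below this
maximum = np.max(data)

print(minimum, lower_quartile, median, upper_quartile, maximum)
```

For this dataset the five numbers come out as 63, 69, 73, 79 and 83.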

What is Boxplot ? How do you create it in Pyplot ?

**Answer**

A boxplot is a visual representation of the statistical five number summary of a given data set, including the extremes (the highest and the lowest numbers), the median, the upper and lower quartiles.

With Pyplot, a boxplot is created using the `boxplot()` function. The syntax is as follows: `matplotlib.pyplot.boxplot(x, notch = None, vert = None, meanline = None, showmeans = None, showbox = None,)`.
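A small sketch of the function in use is given here, with the ordered dataset from this exercise set. The dictionary returned by `boxplot()` holds the drawn lines, so the median line can be read back to check it against the data.

```python
import matplotlib.pyplot as plt

data = [63, 65, 67, 69, 69, 71, 71, 72, 74, 75, 78, 79, 79, 80, 81, 83]

# showmeans=True marks the mean; showbox=True (the default) draws the box
result = plt.boxplot(data, notch=False, showmeans=True, showbox=True)

# The median line is horizontal in a vertical boxplot, so its y value is the median
median_value = result['medians'][0].get_ydata()[0]
print(median_value)
plt.title('Boxplot with Mean Shown')
plt.show()
```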

Execute the following codes and find out what happens ? (Libraries have been imported already ; plt is the alias name for matplotlib.pyplot)

```
A = np.arange(2, 20, 2)
B = np.log(A)
plt.plot(A, B)
```

Will this code produce error ? Why/Why not ?

**Answer**

Executing the provided code will not produce an error. It will generate a line plot of the logarithm of A against A itself.

The line `A = np.arange(2, 20, 2)` creates an array `A` using NumPy's `arange()` function. It starts from 2, increments by 2, and stops before 20 (the stop value itself is not included). This results in the array [2, 4, 6, 8, 10, 12, 14, 16, 18]. Next, the line `B = np.log(A)` calculates the natural logarithm of each element in array `A` using NumPy's `log()` function and stores the results in array `B`. Finally, `plt.plot(A, B)` plots the values in array `A` along the x-axis and the corresponding values in array `B` along the y-axis using Matplotlib's `plot()` function.
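The contents of `A` and `B` can be checked without drawing anything, as in this quick sketch.

```python
import numpy as np

A = np.arange(2, 20, 2)   # start at 2, step by 2, stop before 20
B = np.log(A)             # natural logarithm of every element of A

print(A)         # [ 2  4  6  8 10 12 14 16 18]
print(20 in A)   # False, because the stop value is excluded
print(len(B))    # 9, one logarithm for each element of A
```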

Execute the following codes and find out what happens ? (Libraries have been imported already ; plt is the alias name for matplotlib.pyplot)

```
A = np.arange(2, 20, 2)
B = np.log(A)
plt.bar(A, B)
```

Will this code produce error ? Why/Why not ?

**Answer**

Executing the provided code will not produce an error. It will draw a bar chart with one bar at each value of `A`, where the height of each bar is the corresponding logarithm in `B`. However, a bar chart is better suited to comparing discrete categories, so a line chart usually shows a continuous function like this more clearly.

The line `A = np.arange(2, 20, 2)` creates an array `A` using NumPy's `arange()` function. It starts from 2, increments by 2, and stops before 20 (the stop value itself is not included). This results in the array [2, 4, 6, 8, 10, 12, 14, 16, 18]. Next, the line `B = np.log(A)` calculates the natural logarithm of each element in array `A` using NumPy's `log()` function and stores the results in array `B`. Finally, `plt.bar(A, B)` creates a bar plot using Matplotlib's `bar()` function. It plots the values in array `A` along the x-axis and the corresponding values in array `B` along the y-axis.

Execute the following codes and find out what happens ? (Libraries have been imported already ; plt is the alias name for matplotlib.pyplot)

```
X = np.arange(1, 18, 2.655)
B = np.log(X)
plt.scatter(X, Y)
```

Will this code produce error ? Why/Why not ?

**Answer**

The code will produce an error because the variable `Y` is not defined.

The corrected code is:

```
X = np.arange(1, 18, 2.655)
B = np.log(X)
plt.scatter(X, B)
```

The line `X = np.arange(1, 18, 2.655)` creates an array `X` using NumPy's `arange()` function. It starts from 1, increments by 2.655, and generates values less than 18. The resulting array will look like [1., 3.655, 6.31, 8.965, 11.62, 14.275, 16.93]. Next, the line `B = np.log(X)` calculates the natural logarithm of each element in array `X` using NumPy's `log()` function. Finally, the line `plt.scatter(X, Y)` attempts to use Matplotlib's `scatter()` function to create a scatter plot. However, `Y` is not defined in the code, leading to a NameError.
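The error can be observed safely by wrapping the call in a `try` block, as in this small sketch. The `try`/`except` wrapper is added only for demonstration and is not part of the original question.

```python
import numpy as np
import matplotlib.pyplot as plt

X = np.arange(1, 18, 2.655)
B = np.log(X)

try:
    plt.scatter(X, Y)      # Y was never assigned a value
except NameError as err:
    message = str(err)
    print(message)         # name 'Y' is not defined
```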

Write the output from the given python code :

```
import matplotlib.pyplot as plt
Months = ['Dec', 'Jan', 'Feb', 'Mar']
Attendance = [70, 90, 75, 95]
plt.bar(Months, Attendance)
plt.show()
```

**Answer**

This code snippet uses Matplotlib to create a bar chart. The list `Months` contains the names of the months ['Dec', 'Jan', 'Feb', 'Mar'], while the list `Attendance` holds the corresponding attendance values [70, 90, 75, 95]. The `plt.bar()` function is then used to create a bar plot, where each bar represents a month and its height corresponds to the attendance value. Finally, `plt.show()` is called to display the plot.

Write a program to add titles for the X-axis, Y-axis and for the whole chart in below code.

```
import matplotlib.pyplot as plt
Months = ['Dec', 'Jan', 'Feb', 'Mar']
Attendance = [70, 90, 75, 95]
plt.bar(Months, Attendance)
plt.show()
```

**Answer**

```
import matplotlib.pyplot as plt
Months = ['Dec', 'Jan', 'Feb', 'Mar']
Attendance = [70, 90, 75, 95]
plt.bar(Months, Attendance)
plt.xlabel('Months')
plt.ylabel('Attendance')
plt.title('Attendance Report')
plt.show()
```

plt.plot(A, B) produces (A and B are the sequences same as created in question 1) chart as :

Write code to produce charts as shown below:

**Answer**

```
import numpy as np
import matplotlib.pyplot as plt
A = np.arange(2, 20, 2)
B = np.log(A)
C = np.log(A) * 1.2
plt.plot(A, B)
plt.plot(A, C)
plt.show()
```

```
import numpy as np
import matplotlib.pyplot as plt
A = np.arange(2, 20, 2)
B = np.log(A)
C = np.log(A) * (-1.2)
plt.plot(A, B)
plt.plot(A, C)
plt.show()
```

Write suitable Python code to create 'Favourite Hobby' Bar Chart as shown below :

Also give suitable python statement to save this chart.

**Answer**

```
import matplotlib.pyplot as plt
hobbies = ['Dance', 'Music', 'Painting', 'Playing Sports']
people_count = [300, 400, 100, 500]
plt.bar(hobbies, people_count)
plt.xlabel('Hobbies')
plt.ylabel('Number of People')
plt.title('Favourite Hobby')
plt.savefig('favourite_hobby_chart.png')
plt.show()
```

Consider the following graph. Write the Python code to plot it. Also add the Title, label for X and Y axis.

Using the following data for plotting the graph

```
smarks = [10, 40, 30, 60, 55]
sname = ["Sahil", "Deepak", "Anil", "Ravi", "Riti"]
```

**Answer**

```
import matplotlib.pyplot as plt
smarks = [10, 40, 30, 60, 55]
sname = ["Sahil", "Deepak", "Anil", "Ravi", "Riti"]
plt.plot(sname, smarks)
plt.xlabel('Student Name')
plt.ylabel('Marks Scored')
plt.title('Marks Secured by Students in Term-1')
plt.show()
```

Given a data frame df1 as shown below :

1990 | 2000 | 2010 | |
---|---|---|---|

a | 52 | 340 | 890 |

b | 64 | 480 | 560 |

c | 78 | 688 | 1102 |

d | 94 | 766 | 889 |

Write code to create a scatter chart from the 1990 and 2010 columns of dataframe df1.

**Answer**

```
import pandas as pd
import matplotlib.pyplot as plt
data = {'1990': [52, 64, 78, 94],
'2000': [340, 480, 688, 766],
'2010': [890, 560, 1102, 889]}
df1 = pd.DataFrame(data, index=['a', 'b', 'c', 'd'])
plt.scatter(df1['1990'], df1['2010'])
plt.show()
```

Given a data frame df1 as shown below :

1990 | 2000 | 2010 | |
---|---|---|---|

a | 52 | 340 | 890 |

b | 64 | 480 | 560 |

c | 78 | 688 | 1102 |

d | 94 | 766 | 889 |

Write code to create a line chart from the 1990 and 2000 columns of dataframe df1.

**Answer**

```
import pandas as pd
import matplotlib.pyplot as plt
data = {'1990': [52, 64, 78, 94],
'2000': [340, 480, 688, 766],
'2010': [890, 560, 1102, 889]}
df1 = pd.DataFrame(data, index=['a', 'b', 'c', 'd'])
plt.plot(df1['1990'], df1['2000'])
plt.show()
```

Given a data frame df1 as shown below :

1990 | 2000 | 2010 | |
---|---|---|---|

a | 52 | 340 | 890 |

b | 64 | 480 | 560 |

c | 78 | 688 | 1102 |

d | 94 | 766 | 889 |

Write code to create a bar chart plotting the three columns of dataframe df1.

**Answer**

```
import pandas as pd
import matplotlib.pyplot as plt
data = {'1990': [52, 64, 78, 94],
'2000': [340, 480, 688, 766],
'2010': [890, 560, 1102, 889]}
df1 = pd.DataFrame(data, index=['a', 'b', 'c', 'd'])
df1.plot(kind = 'bar')
plt.show()
```

The score of four teams in 5 IPL matches is available to you. Write a program to plot these in a bar chart.

**Answer**

```
import matplotlib.pyplot as plt
import numpy as np
Matches = ['Match 1', 'Match 2', 'Match 3', 'Match 4', 'Match 5']
Team_A = [150, 160, 170, 180, 190]
Team_B = [140, 150, 160, 170, 180]
Team_C = [130, 140, 150, 160, 170]
Team_D = [120, 130, 140, 150, 160]
X = np.arange(len(Matches))
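# Offset each team's bars so the four bars of a match sit side by side
plt.bar(X, Team_A, width = 0.15, label = 'Team A')
plt.bar(X + 0.15, Team_B, width = 0.15, label = 'Team B')
plt.bar(X + 0.30, Team_C, width = 0.15, label = 'Team C')
plt.bar(X + 0.45, Team_D, width = 0.15, label = 'Team D')
# Place the match names under the centre of each group of bars
plt.xticks(X + 0.225, Matches)
plt.xlabel('Matches')
plt.ylabel('Scores')
plt.title('IPL Scores')
plt.legend()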
plt.show()
```

The score of a team in 5 IPL matches is available to you. Write a program to create a pie chart from this data, showing the last match's performance as a wedge.

**Answer**

```
import matplotlib.pyplot as plt
Matches = ['Match 1', 'Match 2', 'Match 3', 'Match 4', 'Match 5']
Team = [150, 160, 170, 180, 190]
expl = [0, 0, 0, 0, 0.2]
plt.pie(Team, labels = Matches, explode = expl)
plt.title('Team A Scores')
plt.show()
```

The prices of a stock for 3 months are given. Write a program to show the variations in prices for each month by 3 lines on same line chart. Make sure to add legends and labels. Show grid also.

**Answer**

```
import matplotlib.pyplot as plt
months = ['January', 'February', 'March']
prices_stock_A = [100, 120, 110]
prices_stock_B = [90, 110, 100]
prices_stock_C = [95, 115, 105]
plt.plot(months, prices_stock_A, label='Stock A', marker='o')
plt.plot(months, prices_stock_B, label='Stock B', marker='s')
plt.plot(months, prices_stock_C, label='Stock C', marker='^')
plt.xlabel('Months')
plt.ylabel('Prices')
plt.title('Stock Prices Variation')
plt.legend()
plt.grid(True)
plt.show()
```

A distribution data stores about 1000 random numbers. Write a program to create a scatter chart from this data with varying point sizes.

**Answer**

```
import numpy as np
import matplotlib.pyplot as plt
X = np.random.randint(1, 100, size = (1000,))
Y = np.random.randint(1, 100, size = (1000,))
# One size per point, so sizes must have the same length as X and Y
sizes = np.random.randint(10, 100, size = (1000,))
plt.scatter(X, Y, s = sizes, color = 'r')
plt.show()
```

Navya has started an online business. A list stores the number of orders in last 6 months. Write a program to plot this data on a horizontal bar chart.

**Answer**

```
import matplotlib.pyplot as plt
orders = [150, 200, 180, 250, 300, 220]
months = ['January', 'February', 'March', 'April', 'May', 'June']
plt.barh(months, orders)
plt.xlabel('Number of Orders')
plt.ylabel('Month')
plt.title('Number of Orders in Last 6 Months')
plt.show()
```

Given the following set of data :

```
Weight measurements for 16 small orders of French-fries (in grams).
78 72 69 81 63 67 65 75
79 74 71 83 71 79 80 69
```

Create a simple histogram from the above data.

**Answer**

```
import matplotlib.pyplot as plt
weights = [78, 72, 69, 81, 63, 67, 65, 75, 79, 74, 71, 83, 71, 79, 80, 69]
plt.hist(weights)
plt.title('Weight Distribution of French Fries Orders')
plt.show()
```

Given the following set of data :

```
Weight measurements for 16 small orders of French-fries (in grams).
78 72 69 81 63 67 65 75
79 74 71 83 71 79 80 69
```

Create a horizontal histogram from the above data.

**Answer**

```
import matplotlib.pyplot as plt
weights = [78, 72, 69, 81, 63, 67, 65, 75, 79, 74, 71, 83, 71, 79, 80, 69]
plt.hist(weights, orientation = 'horizontal')
plt.title('Weight Distribution of French Fries Orders')
plt.show()
```

Given the following set of data :

```
Weight measurements for 16 small orders of French-fries (in grams).
78 72 69 81 63 67 65 75
79 74 71 83 71 79 80 69
```

Create a step type of histogram from the above data.

**Answer**

```
import matplotlib.pyplot as plt
weights = [78, 72, 69, 81, 63, 67, 65, 75, 79, 74, 71, 83, 71, 79, 80, 69]
plt.hist(weights, histtype = 'step')
plt.title('Weight Distribution of French Fries Orders')
plt.show()
```

Given the following set of data :

```
Weight measurements for 16 small orders of French-fries (in grams).
78 72 69 81 63 67 65 75
79 74 71 83 71 79 80 69
```

Create a cumulative histogram from the above data.

**Answer**

```
import matplotlib.pyplot as plt
weights = [78, 72, 69, 81, 63, 67, 65, 75, 79, 74, 71, 83, 71, 79, 80, 69]
plt.hist(weights, cumulative = True)
plt.title('Weight Distribution of French Fries Orders')
plt.show()
```

Create an ndarray containing 16 values and then plot this array along with dataset of previous question in same histogram, normal histograms.

**Answer**

```
import numpy as np
import matplotlib.pyplot as plt
weights = [78, 72, 69, 81, 63, 67, 65, 75, 79, 74, 71, 83, 71, 79, 80, 69]
random_array = np.arange(16)
plt.hist(weights)
plt.hist(random_array)
plt.title('Normal Histograms')
plt.show()
```

Create an ndarray containing 16 values and then plot this array along with dataset of previous question in same histogram, cumulative histograms.

**Answer**

```
import numpy as np
import matplotlib.pyplot as plt
weights = [78, 72, 69, 81, 63, 67, 65, 75, 79, 74, 71, 83, 71, 79, 80, 69]
random_array = np.arange(16)
plt.hist(weights, cumulative = True)
plt.hist(random_array, cumulative = True)
plt.title('Cumulative Histograms')
plt.show()
```

Create an ndarray containing 16 values and then plot this array along with dataset of previous question in same histogram, horizontal histograms.

**Answer**

```
import numpy as np
import matplotlib.pyplot as plt
weights = [78, 72, 69, 81, 63, 67, 65, 75, 79, 74, 71, 83, 71, 79, 80, 69]
random_array = np.arange(16)
plt.hist(weights, orientation = 'horizontal')
plt.hist(random_array, orientation = 'horizontal')
plt.title('Horizontal Histograms')
plt.show()
```

Out of above plotted histograms, which ones can be used for creating frequency polygons ? Can you draw frequency polygons from all the above histograms ?

**Answer**

Out of the above plotted histograms, none can be used for creating frequency polygons. We cannot draw frequency polygons from all the above histograms because to construct a frequency polygon, we need a step-type histogram. A frequency polygon is constructed by connecting the midpoints of the tops of the bars of a histogram. Step-type histograms provide a clear outline to draw these connections.

Create/draw frequency polygon from the data used in above questions.

**Answer**

```
import matplotlib.pyplot as plt
weights = [78, 72, 69, 81, 63, 67, 65, 75, 79, 74, 71, 83, 71, 79, 80, 69]
plt.figure(figsize = (10, 5))
# hist() returns the bin counts, the bin edges and the drawn patches
n, edges, p = plt.hist(weights, bins = 40, histtype = 'step')
# Midpoint of each bin
m = 0.5 * (edges[1:] + edges[:-1])
m = m.tolist()
# Add a zero-frequency point 10 units beyond each end to close the polygon
m.insert(0, m[0] - 10)
m.append(m[-1] + 10)
n = n.tolist()
n.insert(0, 0)
n.append(0)
# Join the midpoints with straight lines to form the frequency polygon
plt.plot(m, n, '-^')
plt.show()
```

From the following ordered set of data :

```
63, 65, 67, 69, 69, 71, 71, 72, 74, 75, 78, 79, 79, 80, 81, 83
```

Create a horizontal boxplot.

**Answer**

```
import matplotlib.pyplot as plt
data = [63, 65, 67, 69, 69, 71, 71, 72, 74, 75, 78, 79, 79, 80, 81, 83]
plt.boxplot(data, vert = False)
plt.show()
```

From the following ordered set of data :

```
63, 65, 67, 69, 69, 71, 71, 72, 74, 75, 78, 79, 79, 80, 81, 83
```

Create a vertical boxplot.

**Answer**

```
import matplotlib.pyplot as plt
data = [63, 65, 67, 69, 69, 71, 71, 72, 74, 75, 78, 79, 79, 80, 81, 83]
plt.boxplot(data)
plt.show()
```

From the following ordered set of data :

```
63, 65, 67, 69, 69, 71, 71, 72, 74, 75, 78, 79, 79, 80, 81, 83
```

Show means in the boxplot.

**Answer**

```
import matplotlib.pyplot as plt
data = [63, 65, 67, 69, 69, 71, 71, 72, 74, 75, 78, 79, 79, 80, 81, 83]
plt.boxplot(data, showmeans = True)
plt.show()
```

From the following ordered set of data :

```
63, 65, 67, 69, 69, 71, 71, 72, 74, 75, 78, 79, 79, 80, 81, 83
```

Create boxplot without the box.

**Answer**

```
import matplotlib.pyplot as plt
data = [63, 65, 67, 69, 69, 71, 71, 72, 74, 75, 78, 79, 79, 80, 81, 83]
plt.boxplot(data, showbox = False)
plt.show()
```

Sina has created ordered set of data from the number of new customers registered on his online service centre in last 20 months.

Write a program to plot this data on a filled boxplot with means shown.

**Answer**

```
import matplotlib.pyplot as plt
data = [100, 120, 95, 110, 105, 130, 115, 125, 135, 140, 120, 110, 105, 115, 130, 125, 110, 115, 120, 135]
plt.boxplot(data, patch_artist = True, showmeans = True)
plt.title('Number of New Customers Registered in Last 20 Months')
plt.show()
```