KnowledgeBoat Logo
|

Informatics Practices

Consider the following tables Item and Customer and answer the questions that follow:

Table: Item

Item_IDItemNameManufacturerPrice
PC01Personal ComputerHCL India42000
LCO5LaptopHP USA55000
PCO3Personal ComputerDell USA32000
PC06Personal ComputerZenith USA37000
LCO3LaptopDell USA57000

Table: Customer

Item_IDCustomerNameCity
LCO3N RoyDelhi
PCO3H SinghMumbai
PC06R PandeyDelhi
LCO3C SharmaChennai
PC01K AgarwalBengaluru

Assume that the Pandas has been imported as pd.

(a) Create a dataframe called dfI for table Item.

(b) Create a dataframe called dfC for table Customer.

(c) Perform the default join operation on item_ID using two dataframes: dfI and dfC.

(d) Perform the left join operation on item_ID using two dataframes: dfI and dfC.

(e) Perform the right join operation on Item_ID using two dataframes: dfI and dfC.

(f) Perform the default operation on Item_ID using two dataframes: dfI and dfC with the left index as true.

(g) Perform the outer join operation on item_ID using two dataframes: dfI and dfC.

(h) Create a new dataframe dfN using dataframes: dfI and dfC. The new dataframe data will hold both left index and right index true values.

(i) Arrange the dataframe dfN in descending order of Price.

(j) Arrange the dataframe dfN in descending order of City and Price.

Python Data Handling

2 Likes

Answer

(a)

dfI = pd.DataFrame({
    'Item_ID': ['PC01', 'LCO5', 'PCO3', 'PC06', 'LCO3'],
    'ItemName': ['Personal Computer', 'Laptop', 'Personal Computer', 'Personal Computer', 'Laptop'],
    'Manufacturer': ['HCL India', 'HP USA', 'Dell USA', 'Zenith USA', 'Dell USA'],
    'Price': [42000, 55000, 32000, 37000, 57000]
})
Output
  Item_ID           ItemName Manufacturer  Price
0    PC01  Personal Computer    HCL India  42000
1    LCO5             Laptop       HP USA  55000
2    PCO3  Personal Computer     Dell USA  32000
3    PC06  Personal Computer   Zenith USA  37000
4    LCO3             Laptop     Dell USA  57000

(b)

dfC = pd.DataFrame({
    'Item_ID': ['LCO3', 'PCO3', 'PC06', 'LCO3', 'PC01'],
    'CustomerName': ['N Roy', 'H Singh', 'R Pandey', 'C Sharma', 'K Agarwal'],
    'City': ['Delhi', 'Mumbai', 'Delhi', 'Chennai', 'Bengaluru']
})
Output
  Item_ID CustomerName       City
0    LCO3        N Roy      Delhi
1    PCO3      H Singh     Mumbai
2    PC06     R Pandey      Delhi
3    LCO3     C Sharma    Chennai
4    PC01    K Agarwal  Bengaluru

(c)

df_default_join = pd.merge(dfI, dfC, on = 'Item_ID')
Output
  Item_ID           ItemName Manufacturer  Price CustomerName       City
0    PC01  Personal Computer    HCL India  42000    K Agarwal  Bengaluru
1    PCO3  Personal Computer     Dell USA  32000      H Singh     Mumbai
2    PC06  Personal Computer   Zenith USA  37000     R Pandey      Delhi
3    LCO3             Laptop     Dell USA  57000        N Roy      Delhi
4    LCO3             Laptop     Dell USA  57000     C Sharma    Chennai

(d)

df_left_join = pd.merge(dfI, dfC, left_on='Item_ID', right_on='Item_ID', how='left')
Output
  Item_ID           ItemName Manufacturer  Price CustomerName       City
0    PC01  Personal Computer    HCL India  42000    K Agarwal  Bengaluru
1    LCO5             Laptop       HP USA  55000          NaN        NaN
2    PCO3  Personal Computer     Dell USA  32000      H Singh     Mumbai
3    PC06  Personal Computer   Zenith USA  37000     R Pandey      Delhi
4    LCO3             Laptop     Dell USA  57000        N Roy      Delhi
5    LCO3             Laptop     Dell USA  57000     C Sharma    Chennai

(e)

df_right_join = pd.merge(dfI, dfC, left_on='Item_ID', right_on='Item_ID', how='right')
Output
  Item_ID           ItemName Manufacturer  Price CustomerName       City
0    LCO3             Laptop     Dell USA  57000        N Roy      Delhi
1    PCO3  Personal Computer     Dell USA  32000      H Singh     Mumbai
2    PC06  Personal Computer   Zenith USA  37000     R Pandey      Delhi
3    LCO3             Laptop     Dell USA  57000     C Sharma    Chennai
4    PC01  Personal Computer    HCL India  42000    K Agarwal  Bengaluru

(f) This operation is not possible because the left_index parameter is used for merging on the index of the dataframes, not on a specific column. The default merge operation is an inner join, and it requires a common column to merge on.

(g)

df_outer_join = pd.merge(dfI, dfC, on='Item_ID', how='outer')
Output
  Item_ID           ItemName Manufacturer  Price CustomerName       City
0    LCO3             Laptop     Dell USA  57000        N Roy      Delhi
1    LCO3             Laptop     Dell USA  57000     C Sharma    Chennai
2    LCO5             Laptop       HP USA  55000          NaN        NaN
3    PC01  Personal Computer    HCL India  42000    K Agarwal  Bengaluru
4    PC06  Personal Computer   Zenith USA  37000     R Pandey      Delhi
5    PCO3  Personal Computer     Dell USA  32000      H Singh     Mumbai

(h)

dfN = pd.merge(dfI, dfC, left_index=True, right_index=True)
Output
Item_ID_x           ItemName Manufacturer  Price Item_ID_y CustomerName       City
0      PC01  Personal Computer    HCL India  42000      LCO3        N Roy      Delhi
1      LCO5             Laptop       HP USA  55000      PCO3      H Singh     Mumbai
2      PCO3  Personal Computer     Dell USA  32000      PC06     R Pandey      Delhi
3      PC06  Personal Computer   Zenith USA  37000      LCO3     C Sharma    Chennai
4      LCO3             Laptop     Dell USA  57000      PC01    K Agarwal  Bengaluru

(i)

dfN_sorted_price = dfN.sort_values(by='Price', ascending=False)
Output
  Item_ID_x           ItemName Manufacturer  Price Item_ID_y CustomerName       City
4      LCO3             Laptop     Dell USA  57000      PC01    K Agarwal  Bengaluru
1      LCO5             Laptop       HP USA  55000      PCO3      H Singh     Mumbai
0      PC01  Personal Computer    HCL India  42000      LCO3        N Roy      Delhi
3      PC06  Personal Computer   Zenith USA  37000      LCO3     C Sharma    Chennai
2      PCO3  Personal Computer     Dell USA  32000      PC06     R Pandey      Delhi

(j)

dfN_sorted_city_price = dfN.sort_values(by=['City', 'Price'], ascending=[False, False])
Output
  Item_ID_x           ItemName Manufacturer  Price Item_ID_y CustomerName       City
1      LCO5             Laptop       HP USA  55000      PCO3      H Singh     Mumbai
0      PC01  Personal Computer    HCL India  42000      LCO3        N Roy      Delhi
2      PCO3  Personal Computer     Dell USA  32000      PC06     R Pandey      Delhi
3      PC06  Personal Computer   Zenith USA  37000      LCO3     C Sharma    Chennai
4      LCO3             Laptop     Dell USA  57000      PC01    K Agarwal  Bengaluru

Answered By

2 Likes


Related Questions