The purpose of this project is to collect common pandas table operations in one place. It contains examples of the following operations:

LEFT JOIN

joinResult = pandas.merge(orders, barcodes, how='left', on='order_id')
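A minimal sketch of the left join, assuming two small made-up tables (the sample rows are hypothetical; only the column names come from the snippets in this README). Every order is kept, and an order without barcodes gets NaN in the barcode column.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from this README.
orders = pd.DataFrame({
    "order_id": [1, 2, 3],
    "customer_id": [10, 10, 20],
})
barcodes = pd.DataFrame({
    "barcode": [111, 222, 333],
    "order_id": [1, 1, 2],
})

# Left join: keep all rows of `orders`, attach matching rows of `barcodes`.
# Order 3 has no barcode, so its 'barcode' value is NaN.
joinResult = pd.merge(orders, barcodes, how="left", on="order_id")
print(joinResult)
```

Because of the NaN, pandas stores the barcode column as floats in the joined table.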

GROUP BY

Create a list of barcodes per customer and order, in the following format:

customer_id | order_id1 | [barcode1, barcode2, ...]
customer_id | order_id2 | [barcode1, barcode2, ...]

joinResult.groupby(['customer_id', 'order_id'])['barcode'].apply(list).reset_index(name='barcodes')
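As a rough illustration of the grouping, here is a self-contained sketch on a made-up joined table (the rows and the variable name grouped are assumptions for the example). Each (customer_id, order_id) pair collapses into one row holding a list of its barcodes.

```python
import pandas as pd

# Hypothetical joined table; in the project this would come from the LEFT JOIN.
joinResult = pd.DataFrame({
    "customer_id": [10, 10, 10, 20],
    "order_id": [1, 1, 2, 3],
    "barcode": [111, 222, 333, 444],
})

# Collect the barcodes of each (customer, order) pair into a Python list,
# then turn the grouped Series back into a flat DataFrame.
grouped = (
    joinResult.groupby(["customer_id", "order_id"])["barcode"]
    .apply(list)
    .reset_index(name="barcodes")
)
print(grouped)
```

If the input comes from a left join, an order without barcodes ends up with the list [nan]; calling joinResult.dropna(subset=['barcode']) first removes such rows if that is not wanted.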

Select the rows whose barcode appears more than once (duplicates):

barcodes.groupby(['barcode']).filter(lambda x: len(x) > 1)
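A short sketch of what this filter returns, using invented sample rows: groups with more than one row are kept in full, so every occurrence of a repeated barcode is returned with its original index. The second statement shows DataFrame.duplicated with keep=False as one possible alternative that selects the same rows for this sample.

```python
import pandas as pd

# Hypothetical sample data: barcode 111 is assigned to two different orders.
barcodes = pd.DataFrame({
    "barcode": [111, 222, 111, 333],
    "order_id": [1, 1, 2, 3],
})

# Keep only the groups that contain more than one row.
duplicates = barcodes.groupby(["barcode"]).filter(lambda x: len(x) > 1)
print(duplicates)

# Alternative: mark every occurrence of a repeated barcode as a duplicate.
same_rows = barcodes[barcodes.duplicated(subset="barcode", keep=False)]
print(same_rows)
```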

TOP 5 RECORDS

Get the 5 customers who purchased the most barcodes:

joinResult.groupby(['customer_id'])['barcode'].count().nlargest(5).reset_index(name='total_tickets')
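The following sketch runs the same chain on made-up data (the rows and the variable name top5 are assumptions). Note that count() ignores NaN, so a customer whose orders have no barcodes counts as 0 tickets.

```python
import pandas as pd

# Hypothetical joined table; customer 70 has an order without any barcode.
joinResult = pd.DataFrame({
    "customer_id": [10, 10, 10, 20, 20, 30, 40, 50, 60, 70],
    "barcode": [1, 2, 3, 4, 5, 6, 7, 8, 9, None],
})

# Count non-missing barcodes per customer, keep the 5 largest counts,
# and return them as a DataFrame with a named count column.
top5 = (
    joinResult.groupby(["customer_id"])["barcode"]
    .count()
    .nlargest(5)
    .reset_index(name="total_tickets")
)
print(top5)
```

Several customers in this sample tie at one ticket. Which of them make the cut is governed by the keep parameter of Series.nlargest, which defaults to 'first'.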

GETTING STARTED

  • Install pandas: pip install pandas
  • Install MagicMock: pip install MagicMock (on Python 3.3+, MagicMock also ships in the standard library as unittest.mock.MagicMock)
  • Run application: python pandas-sample.py
  • Run unit tests: python pandas-sample_unittest.py