In this assignment, you will use Python’s mlxtend.frequent_patterns to find the association rules that satisfy given lift, confidence, and support thresholds for the list of transactions in the ‘online_retail.csv’ file.
Step 1 (35 points): Clean the data by
removing rows whose StockCode or Invoice values contain non-digit characters,
removing rows whose Price values are less than 10,
removing rows whose Country values are not equal to “United Kingdom”, “Italy”, “France”, “Germany”, “Norway”, “Finland”, “Austria”, “Belgium”, “European Community”, “Cyprus”, “Greece”, “Iceland”, “Malta”, “Netherlands”, “Portugal”, “Spain”, “Sweden”, or “Switzerland”,
removing rows whose Quantity values are negative, and
trimming the Description values using Python’s str.strip method.
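The cleaning rules above could be sketched with pandas roughly as follows. This is only an illustration on a small made-up table: the column names (Invoice, StockCode, Description, Quantity, Price, Country), the helper name clean_transactions, and the sample rows are all assumptions, so adjust them to match the header of your copy of ‘online_retail.csv’.

```python
import pandas as pd

# Countries listed in Step 1.
ALLOWED_COUNTRIES = {
    "United Kingdom", "Italy", "France", "Germany", "Norway", "Finland",
    "Austria", "Belgium", "European Community", "Cyprus", "Greece",
    "Iceland", "Malta", "Netherlands", "Portugal", "Spain", "Sweden",
    "Switzerland",
}


def clean_transactions(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper applying the five cleaning rules of Step 1."""
    # Treat identifiers as text so values such as "C1002" or "85123A" are caught.
    invoice = df["Invoice"].astype(str).str.strip()
    stock_code = df["StockCode"].astype(str).str.strip()

    keep = (
        invoice.str.fullmatch(r"[0-9]+")        # digits only
        & stock_code.str.fullmatch(r"[0-9]+")   # digits only
        & (df["Price"] >= 10)                   # drop Price < 10
        & df["Country"].isin(ALLOWED_COUNTRIES)
        & (df["Quantity"] >= 0)                 # drop negative Quantity
    )
    cleaned = df.loc[keep].copy()
    # Trim leading and trailing whitespace from the descriptions.
    cleaned["Description"] = cleaned["Description"].str.strip()
    return cleaned


# Made-up sample rows; each dropped row violates exactly one rule.
sample = pd.DataFrame({
    "Invoice": ["1001", "C1002", "1003", "1004", "1005", "1006", "1007"],
    "StockCode": ["22001", "22002", "85123A", "22004", "22005", "22006", "22007"],
    "Description": ["  TEA SET ", "MUG", "LAMP", "PEN", "CLOCK", "VASE", "CHAIR "],
    "Quantity": [2, 1, 1, 5, 1, -1, 1],
    "Price": [12.5, 15.0, 20.0, 2.0, 30.0, 11.0, 10.0],
    "Country": ["United Kingdom", "France", "Germany", "Italy",
                "Australia", "Spain", "Norway"],
})

cleaned = clean_transactions(sample)
print(cleaned)
```

When loading the real file, reading the identifier columns as text, for example pd.read_csv("online_retail.csv", dtype={"Invoice": str, "StockCode": str}), keeps the digit check from being affected by numeric parsing.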
Step 2 (30 points): Find the frequent itemsets with min_support = 0.01.
Step 3 (35 points): Find the association rules with confidence greater than 10%. Among them, which rule(s) have the highest lift?
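To make the metrics in this step concrete, here is a small plain-Python sketch of how support, confidence, and lift relate for a rule A -> C: confidence is support(A and C) / support(A), and lift is confidence / support(C). The function names and toy transactions are hypothetical and only illustrate the definitions. In mlxtend, the association_rules function reports these metrics for rules derived from the frequent itemsets, with metric="confidence" and min_threshold selecting the confidence cutoff; check the documentation of your installed version for the exact signature.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def rule_metrics(antecedent, consequent, transactions):
    """Support, confidence, and lift of the rule antecedent -> consequent."""
    s_a = support(antecedent, transactions)
    s_c = support(consequent, transactions)
    s_ac = support(set(antecedent) | set(consequent), transactions)
    confidence = s_ac / s_a
    lift = confidence / s_c
    return {"support": s_ac, "confidence": confidence, "lift": lift}


# Toy transactions (made up for illustration).
transactions = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "butter", "milk"}),
    frozenset({"milk"}),
]

butter_to_bread = rule_metrics({"butter"}, {"bread"}, transactions)
bread_to_milk = rule_metrics({"bread"}, {"milk"}, transactions)
print(butter_to_bread)
print(bread_to_milk)
```

A lift above 1 means the antecedent and consequent occur together more often than they would if they were independent, which is why the rule(s) with the highest lift are the ones to report.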
Deliverables
Submit a zip file containing your Python code and a “report.pdf” answering the question asked in step 3.