Question 5 (35 points)
In this question, we use the Pima Indians Diabetes Database that was used in Assignment 1 and
only consider the attributes Pregnancies, Glucose (discretized into groups of width 10,
i.e., [0, 9], [10, 19], …, [190, 199]), BloodPressure (discretized into groups of width 10 in the
same way), Age (discretized into groups of width 10 in the same way), and Outcome.
Treat each record in the database as a transaction. Please note that two attributes may take the
same value, e.g., 0, but the same value on different attributes should be treated as different
items.
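One way to read the encoding described above is sketched here in Python. The helper names bin_label and to_transaction, the "Attribute=value" item format, and the sample record are all assumptions made for illustration; prefixing every value with its attribute name is what keeps equal values on different attributes apart.

```python
BINNED = ("Glucose", "BloodPressure", "Age")  # attributes grouped in bins of width 10

def bin_label(value, width=10):
    """Return the label of the bin holding value, e.g. 148 -> '[140, 149]'."""
    low = (int(value) // width) * width
    return f"[{low}, {low + width - 1}]"

def to_transaction(record):
    """Turn one record (a dict of attribute -> number) into a set of items.

    Each item carries its attribute name, so Pregnancies=0 and Outcome=0
    are different items even though both values are 0.
    """
    items = {f"Pregnancies={int(record['Pregnancies'])}"}
    for attr in BINNED:
        items.add(f"{attr}={bin_label(record[attr])}")
    items.add(f"Outcome={int(record['Outcome'])}")
    return frozenset(items)

# Illustrative record only; the values are not taken from the dataset.
row = {"Pregnancies": 2, "Glucose": 105, "BloodPressure": 70, "Age": 31, "Outcome": 0}
print(sorted(to_transaction(row)))
```

Pregnancies is left undiscretized here because the question only asks for Glucose, BloodPressure, and Age to be grouped.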
Find the 100 most frequent itemsets such that each itemset contains an item from
attribute Outcome. You can write your own program or use any existing program available on
the web or in open-source suites. Please cite any programs that you did not develop
yourself. You can use FP-growth, Apriori, or any of their variants. Show the code.
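A minimal sketch of one possible constrained search follows; it is not the only valid approach. It is an Apriori-style level-wise search whose first level holds only the Outcome items, so every itemset it grows contains one by construction. The names mine_with_outcome and top_k, the min_count parameter, the tie-breaking order, and the toy transactions are assumptions for illustration. Pruning stays valid because adding items to an itemset can never raise its support.

```python
from collections import Counter

def mine_with_outcome(transactions, min_count=1, target="Outcome="):
    """Level-wise search restricted to itemsets that contain a target item."""
    # Level 1: seed with the Outcome items only.
    counts = Counter()
    for t in transactions:
        for item in t:
            if item.startswith(target):
                counts[frozenset([item])] += 1
    frequent = {s: c for s, c in counts.items() if c >= min_count}
    result = dict(frequent)
    while frequent:
        counts = Counter()
        for t in transactions:
            # Extend each frequent itemset found in t by one more item of t.
            extended = set()
            for base in frequent:
                if base <= t:
                    for item in t - base:
                        extended.add(base | {item})
            counts.update(extended)  # each candidate counted once per transaction
        frequent = {s: c for s, c in counts.items() if c >= min_count}
        result.update(frequent)
    return result

def top_k(itemsets, k=100):
    """Rank by support count; ties go to shorter, then alphabetically earlier itemsets."""
    ranked = sorted(itemsets.items(),
                    key=lambda kv: (-kv[1], len(kv[0]), sorted(kv[0])))
    return ranked[:k]

# Toy transactions for illustration only.
toy = [
    frozenset({"Glucose=[100, 109]", "Age=[20, 29]", "Outcome=0"}),
    frozenset({"Glucose=[100, 109]", "Age=[30, 39]", "Outcome=0"}),
    frozenset({"Glucose=[140, 149]", "Age=[30, 39]", "Outcome=1"}),
]
found = mine_with_outcome(toy)
for itemset, count in top_k(found, k=3):
    print(count, sorted(itemset))
```

Itemsets tied at the 100th position need an explicit policy; the ordering used in top_k is only one choice.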
1. (10 points) Describe your approach, in particular how you modify the original
FP-growth or Apriori algorithm to ensure that each frequent pattern contains an item from
attribute Outcome.
2. (10 points) Report the 100 most frequent itemsets found.
3. (15 points) Using the 100 most frequent itemsets found, form the 5 association rules
with the highest confidence. Describe your method and list the rules.
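For part 3, the confidence of a rule X -> Y is support(X and Y together) / support(X). The sketch below assumes one particular reading of the task: a rule is formed only when both the full itemset and its antecedent are among the supplied itemsets, so every antecedent contains an Outcome item. The function name rules_from_itemsets, the tie-breaking order, and the support counts in the example are hypothetical.

```python
from itertools import combinations

def rules_from_itemsets(supports, top_n=5):
    """Build rules antecedent -> consequent from a dict {itemset: support count}.

    A rule is kept only if its antecedent is itself a key of supports,
    because that count is needed as the denominator of the confidence.
    """
    rules = []
    for itemset, count in supports.items():
        for r in range(1, len(itemset)):
            for subset in combinations(sorted(itemset), r):
                antecedent = frozenset(subset)
                if antecedent in supports:
                    confidence = count / supports[antecedent]
                    rules.append((confidence, count, antecedent, itemset - antecedent))
    # Highest confidence first; ties go to the rule with the larger support count.
    rules.sort(key=lambda x: (-x[0], -x[1], sorted(x[2]), sorted(x[3])))
    return rules[:top_n]

# Made-up support counts for illustration only.
example = {
    frozenset({"Outcome=0"}): 4,
    frozenset({"Outcome=0", "Age=[20, 29]"}): 3,
    frozenset({"Outcome=0", "Age=[20, 29]", "Pregnancies=1"}): 3,
}
for confidence, count, lhs, rhs in rules_from_itemsets(example):
    print(f"{sorted(lhs)} -> {sorted(rhs)}  confidence={confidence:.2f}  count={count}")
```

Rules that predict Outcome, of the form X -> Outcome=c, need the support of X alone, which is not among itemsets that all contain an Outcome item; under that reading it would have to be counted from the transactions or summed over the Outcome values.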
