Comparative Analysis of the FP-Growth and CP-Tree Algorithms for Text Data

Dian Sa’adillah Maylawati

doi:10.33364/algoritma/v.15-1.1

Submitted

Oct 11, 2019

Published

Mar 14, 2018

Abstract

Frequent Pattern Growth (FP-Growth) and Compact Pattern Tree (CP-Tree) are Frequent Itemset Mining (FIM) algorithms that produce frequent itemsets from a transaction database. Frequent itemsets can serve as a structured representation of text data, which is unstructured or semi-structured. CP-Tree is an FIM algorithm developed from FP-Growth; however, CP-Tree processes data incrementally, whereas FP-Growth is non-incremental. This article analyzes how the FP-Growth and CP-Tree algorithms produce a structured representation of text data. The analysis and evaluation show that the frequent itemsets derived from the tree representations of the two algorithms are the same. In terms of process, FP-Growth is simpler than CP-Tree. However, CP-Tree is more flexible than FP-Growth when new transactions are added, because, unlike FP-Growth, it does not have to rescan the data and rebuild the tree structure from scratch when new transaction data arrives.
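The contrast between the two construction styles can be illustrated with a minimal Python sketch. It is not the implementation evaluated in the article. It assumes that each document is a transaction of words, that the prefix tree is a nested dictionary, and that ties in support are broken alphabetically. The names `insert`, `build_fp_tree`, `paths`, and `CPTree` are hypothetical, and the restructuring step simply re-inserts the stored paths in frequency-descending order, which is a simplification of the restructuring described for CP-Tree in the literature.

```python
from collections import Counter


def insert(tree, items, count=1):
    """Insert an ordered list of items into a nested-dict prefix tree."""
    node = tree
    for item in items:
        child = node.setdefault(item, {"count": 0, "children": {}})
        child["count"] += count
        node = child["children"]


def build_fp_tree(transactions, min_support):
    """Non-incremental, two-scan construction in the style of FP-Growth."""
    # Scan 1: count the support of every item.
    support = Counter(item for t in transactions for item in set(t))
    # Scan 2: insert the frequent items of each transaction,
    # ordered by descending support (ties broken alphabetically).
    tree = {}
    for t in transactions:
        items = sorted((i for i in set(t) if support[i] >= min_support),
                       key=lambda i: (-support[i], i))
        insert(tree, items)
    return tree


def paths(tree, prefix=()):
    """Yield (path, n) where n transactions end exactly at that path."""
    for item, node in tree.items():
        path = prefix + (item,)
        ending_here = node["count"] - sum(
            child["count"] for child in node["children"].values())
        if ending_here > 0:
            yield path, ending_here
        yield from paths(node["children"], path)


class CPTree:
    """Simplified CP-Tree-style structure: one pass, then restructure."""

    def __init__(self):
        self.tree = {}
        self.support = Counter()

    def add(self, transaction):
        # Insertion phase: a fixed (alphabetical) item order, so a new
        # transaction never forces a rescan of the earlier ones.
        items = sorted(set(transaction))
        self.support.update(items)
        insert(self.tree, items)

    def restructured(self, min_support):
        # Restructuring phase (simplified): re-insert the stored paths
        # in frequency-descending order, dropping infrequent items.
        new_tree = {}
        for path, count in paths(self.tree):
            items = sorted((i for i in path if self.support[i] >= min_support),
                           key=lambda i: (-self.support[i], i))
            insert(new_tree, items, count)
        return new_tree


# Hypothetical toy corpus: each document is a list of words.
docs = [["data", "text", "mining"], ["text", "mining"], ["data", "mining"]]
new_doc = ["pattern", "mining", "text"]

cp = CPTree()
for doc in docs:
    cp.add(doc)
cp.add(new_doc)  # incremental: only the new document is processed

# FP-Growth-style construction has to start again over all documents.
fp_tree = build_fp_tree(docs + [new_doc], min_support=2)
assert cp.restructured(min_support=2) == fp_tree
```

In this sketch both routes end in the same frequency-ordered tree, which is consistent with the abstract's observation that the resulting frequent itemsets are the same. The difference is in the work done when a new document arrives: `CPTree.add` touches only that document, while `build_fp_tree` repeats both scans.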

How to Cite

[1]

Dian Sa’adillah Maylawati, “Analisis Perbandingan Algoritma FP-Growth dan CP-Tree untuk Data Teks” [Comparative Analysis of the FP-Growth and CP-Tree Algorithms for Text Data], Jurnal Algoritma, vol. 15, no. 1, pp. 1–6, Mar 2018.

