Implementation of the Ratcliff/Obershelp Algorithm for Document Similarity Detection
DOI:
https://doi.org/10.51454/decode.v4i3.643
Keywords:
Document, Plagiarism, Ratcliff/Obershelp, Similarity
Abstract
Plagiarism is the act of copying another person's work and claiming it as one's own. This study detects similarity between documents by computing a similarity score with the Ratcliff/Obershelp algorithm. The documents tested are .pdf files written in Indonesian. Before the similarity score is computed, each document passes through four preprocessing stages: case folding, tokenization, filtering, and stemming. The preprocessed text is then compared using the Ratcliff/Obershelp algorithm. Testing was carried out on 150 documents, and the resulting similarity scores were grouped into three levels of similarity (high, medium, and low). The detection results are expected to support the comparison of document pairs when the number of documents is very large.
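The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the text has already been extracted from the .pdf files, uses a deliberately tiny, hypothetical stopword list for the filtering stage, and leaves stemming out because that needs an Indonesian stemmer that is not part of the standard library. The score follows the commonly cited Ratcliff/Obershelp definition, 2 * Km / (|S1| + |S2|), where Km is the number of matching characters found by recursively taking the longest common substring and then repeating on the pieces to its left and right. All function names here are illustrative.

```python
import re

# Hypothetical, deliberately tiny stopword list for illustration only.
STOPWORDS = {"yang", "dan", "di", "ke", "dari", "ini", "itu", "adalah"}


def preprocess(text):
    """Apply case folding, tokenization, and filtering (stemming omitted)."""
    folded = text.lower()                              # case folding
    tokens = re.findall(r"[a-z]+", folded)             # tokenization
    kept = [t for t in tokens if t not in STOPWORDS]   # filtering
    return " ".join(kept)


def longest_common_substring(a, b):
    """Return (start_a, start_b, length) of the longest common substring."""
    best = (0, 0, 0)
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the common run that ended at a[i-2], b[j-2].
                curr[j] = prev[j - 1] + 1
                if curr[j] > best[2]:
                    best = (i - curr[j], j - curr[j], curr[j])
        prev = curr
    return best


def matching_chars(a, b):
    """Count matching characters (Km) by recursing around the longest match."""
    if not a or not b:
        return 0
    i, j, n = longest_common_substring(a, b)
    if n == 0:
        return 0
    left = matching_chars(a[:i], b[:j])
    right = matching_chars(a[i + n:], b[j + n:])
    return n + left + right


def ratcliff_obershelp(a, b):
    """Similarity in [0, 1]: 2 * Km / (len(a) + len(b))."""
    if not a and not b:
        return 1.0
    return 2.0 * matching_chars(a, b) / (len(a) + len(b))


if __name__ == "__main__":
    doc1 = preprocess("Dokumen ini adalah dokumen yang diuji.")
    doc2 = preprocess("Dokumen itu adalah dokumen pembanding.")
    print(round(ratcliff_obershelp(doc1, doc2), 4))
```

This quadratic-time version is only meant to make the steps concrete; comparing many long documents would need a faster longest-common-substring routine. Python's standard `difflib.SequenceMatcher` implements a closely related matching scheme and may be a practical starting point, although its results can differ from the textbook definition used here.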
License
Copyright (c) 2024 Maukar, Ety Sutanty, Rini Arianty, Esti Setiyaningsih

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.