DOAJ Open Access 2024

Efficient List Intersection Algorithm for Short Documents by Document Reordering

Lianyin Jia, Dongyang Li, Haihe Zhou, Fengling Xia

Abstract

List intersection plays a pivotal role in various domains such as search engines, database systems, and social networks. Efficient indexes and query strategies can significantly enhance the efficiency of list intersection. Existing inverted index-based algorithms fail to utilize the length information of documents and require excessive list intersections, resulting in lower efficiency. To address this issue, in this paper, we propose the LDRpV (Length-based Document Reordering plus Verification) algorithm. LDRpV filters out documents that are unlikely to satisfy the intersection results by reordering documents based on their length, thereby reducing the number of candidates. Additionally, to minimize the number of list intersection operations, an intersection and verification strategy is designed, where only the first m lists are intersected, and the resulting candidate set is directly verified. This approach effectively improves the efficiency of list intersection. Experimental results on four real datasets demonstrate that LDRpV can achieve a maximum efficiency improvement of 46.69% compared to the most competitive counterparts.
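
The abstract does not give implementation details, so the following Python sketch is only one plausible reading of the two ideas it names: reordering documents by length so that documents too short to match a query can be skipped, and intersecting only the first m posting lists before verifying the surviving candidates directly against the documents. The function names `build_index` and `intersect_and_verify`, the choice of the m shortest lists, and the use of set intersection are assumptions made for illustration, not the authors' implementation.

```python
from bisect import bisect_left
from collections import defaultdict


def build_index(docs):
    """Reorder documents by distinct-term count and build an inverted index.

    Assumption: new IDs are assigned in ascending length order, so each
    posting list is sorted by ID and therefore also by document length.
    """
    term_sets = sorted((set(d) for d in docs), key=len)
    lengths = [len(s) for s in term_sets]
    index = defaultdict(list)
    for doc_id, terms in enumerate(term_sets):
        for t in terms:
            index[t].append(doc_id)
    return term_sets, lengths, index


def intersect_and_verify(query, term_sets, lengths, index, m=2):
    """Return reordered IDs of documents that contain every query term."""
    q = set(query)
    if not q or any(t not in index for t in q):
        return []

    # Length filter: a document with fewer than |q| distinct terms cannot
    # contain all of q, so every ID below first_id is skipped.
    first_id = bisect_left(lengths, len(q))

    # Intersect only the m shortest posting lists (hypothetical choice).
    lists = sorted((index[t] for t in q), key=len)
    candidates = None
    for plist in lists[:max(1, m)]:
        tail = set(plist[bisect_left(plist, first_id):])
        candidates = tail if candidates is None else candidates & tail
        if not candidates:
            return []

    # Verification: check the remaining terms against each candidate
    # document instead of intersecting the remaining lists.
    return sorted(d for d in candidates if q <= term_sets[d])


docs = [["a", "b", "c"], ["a"], ["a", "b"], ["b", "c", "d", "a"], ["c"]]
term_sets, lengths, index = build_index(docs)
print(intersect_and_verify(["a", "b", "c"], term_sets, lengths, index, m=2))  # [3, 4]
```

In this sketch a smaller m means fewer list intersections but more candidates to verify, which is the trade-off the abstract's "intersection and verification" strategy appears to target for short documents.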

Citation Format

Jia, L., Li, D., Zhou, H., Xia, F. (2024). Efficient List Intersection Algorithm for Short Documents by Document Reordering. https://doi.org/10.3390/math12091328

Journal Information

Publication Year: 2024
Source Database: DOAJ
DOI: 10.3390/math12091328
Access: Open Access