STAVIES (algorithm)
STAVIES is a proposed algorithm for extracting information from the World Wide Web.
Its main contribution is a signal-wise treatment of the tag structural hierarchy, combined with hierarchical clustering techniques that segment web pages. STAVIES can operate without human intervention and does not require any training.
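The general idea of combining a tag-hierarchy signal with hierarchical clustering can be illustrated with a minimal sketch. This is not the published STAVIES procedure: the tree-distance measure, the adjacency-constrained single-linkage merge, the `max_gap` threshold and all function and class names below are assumptions chosen for illustration. The sketch records, for each text node, its path in the tag tree, treats the sequence of tree distances between consecutive text nodes as a one-dimensional signal, and merges neighbouring nodes bottom-up until the smallest remaining gap exceeds the threshold.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they are not pushed on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class TextNodeCollector(HTMLParser):
    """Collects each non-empty text node with the path of its ancestors."""

    def __init__(self):
        super().__init__()
        self._stack = []    # (tag name, unique element id) per open element
        self._next_id = 0
        self.nodes = []     # list of (path of element ids, text)

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        self._stack.append((tag, self._next_id))
        self._next_id += 1

    def handle_endtag(self, tag):
        # Pop back to the nearest matching open tag; ignore stray end tags.
        for i in range(len(self._stack) - 1, -1, -1):
            if self._stack[i][0] == tag:
                del self._stack[i:]
                break

    def handle_data(self, data):
        text = data.strip()
        if text:
            path = tuple(node_id for _, node_id in self._stack)
            self.nodes.append((path, text))


def tree_distance(a, b):
    """Number of tree edges between the parent elements of two text nodes."""
    common = 0
    for x, y in zip(a, b):
        if x != y:
            break
        common += 1
    return (len(a) - common) + (len(b) - common)


def gap_signal(nodes):
    """Tree distance between each pair of consecutive text nodes."""
    return [tree_distance(a[0], b[0]) for a, b in zip(nodes, nodes[1:])]


def segment_page(html, max_gap):
    """Agglomerative (single-linkage) clustering of adjacent text nodes.

    Hypothetical illustration: merges the closest neighbouring clusters
    until the smallest remaining gap is larger than max_gap.
    """
    parser = TextNodeCollector()
    parser.feed(html)
    parser.close()
    clusters = [[text] for _, text in parser.nodes]
    gaps = gap_signal(parser.nodes)
    while gaps and min(gaps) <= max_gap:
        i = gaps.index(min(gaps))
        clusters[i:i + 2] = [clusters[i] + clusters[i + 1]]
        del gaps[i]
    return clusters


SAMPLE = """<html><body>
<div id="nav"><a>Home</a><a>About</a></div>
<ul>
<li><b>Item 1</b><span>$10</span></li>
<li><b>Item 2</b><span>$20</span></li>
</ul>
</body></html>"""

records = segment_page(SAMPLE, max_gap=2)   # fine-grained segments
regions = segment_page(SAMPLE, max_gap=4)   # coarser segments
```

With the sample page, a low threshold groups each list item into its own segment, while a higher threshold merges the items into a single region that stays separate from the navigation links. Choosing the threshold automatically, as a fully unsupervised system would need to do, is outside the scope of this sketch.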
Sources
Papadakis, Nikolaos; Dimitrios Skoutas, Konstantinos Raftopoulos and Theodora Varvarigou (December 2005). "STAVIES: A System for Information Extraction from Unknown Web Data Sources through Automatic Web Wrapper Generation, using Clustering Techniques". IEEE Transactions on Knowledge and Data Engineering 17 (12): 1638–1652.