Web-Harvest
Java web data extraction and scraping framework. XML config; XPath, XQuery, regex; HTTP client, HTML/XML parsing, plugin system. CLI and web IDE; outputs WARC-style and structured data.