How to Use rs-trafilatura with Firecrawl

By Steel Nova · April 3, 2026 · 1 min read

Firecrawl is an API service for scraping web pages. It handles JavaScript rendering, anti-bot bypass, and rate limiting: you send it a URL, and it gives you back the page content.

By default, Firecrawl returns Markdown. But if you request the raw HTML, you can run rs-trafilatura on it for page-type-aware extraction with quality scoring. This is useful when you need structured metadata (title, author, date, page type) or when you want to know how confident the extraction is.

Install

    pip install rs-trafilatura firecrawl

You also need a Firecrawl API key from firecrawl.dev.

Basic Usage

    from firecrawl import FirecrawlApp
    from rs_trafilatura.firecrawl import extract_firecrawl_result

    app = FirecrawlApp(api_key="fc-your-api-key")

    # Request HTML format (required for rs-trafilatura)
    result = app.scrape("https://example.com/blog/post", formats=["html"])

    # Extract with rs-trafilatura
    extracted = extract_firecrawl_result(result)

    print(f"Title: {extracted.title}")
    print(f"Author: {extracted.author}")
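Because rs-trafilatura needs the HTML format, it can help to confirm that the scrape result actually carries HTML before extracting. The snippet below is a minimal sketch, not part of either library: `get_html` is a hypothetical helper, and it assumes the result exposes the page HTML either as an `html` attribute (object-style results) or under an `"html"` key (dict-style results). Check the result shape your Firecrawl SDK version returns before relying on it.

```python
from typing import Any, Optional


def get_html(result: Any) -> Optional[str]:
    """Return the raw HTML from a Firecrawl-style scrape result, or None.

    Hypothetical helper: the `html` attribute/key is an assumption about
    the result shape, not a documented guarantee.
    """
    if isinstance(result, dict):
        # Assumption: dict-shaped results store the HTML under "html".
        html = result.get("html")
    else:
        # Assumption: object-shaped results expose it as `.html`.
        html = getattr(result, "html", None)

    # Treat empty or whitespace-only strings as missing.
    if isinstance(html, str) and html.strip():
        return html
    return None
```

A caller could run `get_html(result)` right after the scrape and skip extraction (or retry with `formats=["html"]`) when it returns `None`, for example when only Markdown was requested.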
