Real estate investment firm

Real estate investment CRM with integrated scraping & price monitoring

Built a full-stack CRM for a real estate investment firm. It pulls live property data from Zillow, InvestorLift, county appraisers, and permit systems, enriches it with AI scoring, and tracks deals from discovery through renovation to sale.

Tech stack: Python, Scrapy, React, OpenAI

  • 5 data sources
  • 30+ counties enriched
  • 100+ fields per property
  • 24/7 uptime

The Challenge

The client needed a single system that could discover investment properties from multiple marketplaces, enrich each property with county appraiser data and permit history, monitor price changes around the clock, and manage the full deal lifecycle — all without switching between a dozen tabs and spreadsheets.

The Approach

I built a three-tier system: a React CRM frontend for deal management, a serverless backend for business logic, and a dedicated scraper layer running on its own server that pulls from Zillow, InvestorLift, county appraiser systems, and permit databases — then feeds everything into the CRM automatically.

The system

This is more than a scraper: it is a complete investment operations platform. The architecture has three layers that work together:

Frontend (React CRM) — a full-featured web application where the team manages properties, tracks deals, schedules renovations, signs contracts, and monitors price changes. Covers the entire investment workflow.

Serverless backend — custom functions handling data ingestion, Google Drive integration, calendar scheduling, payment reminders, ROI calculations, and deal workflow automation.

Scraper layer (dedicated server) — three always-on services: a Zillow crawler, an InvestorLift crawler, and a price monitoring daemon. Each runs independently, feeds data into the CRM through an ingestion pipeline, and handles its own deduplication and error recovery.


Data sources

The system pulls from multiple external data sources and combines them into a single, enriched property profile in the CRM.
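
As a rough illustration of how records from several sources could be combined into one profile, here is a minimal sketch. It is not the production code: the function name `merge_profile`, the source keys, and the "first source in the priority list wins" rule are all assumptions made for the example.

```python
def merge_profile(records: dict[str, dict], priority: list[str]) -> dict:
    """Merge per-source records into one profile (hypothetical helper).

    `records` maps a source name to the fields it returned.
    `priority` lists sources from most to least trusted; the first
    non-empty value for a field wins (an assumed rule).
    """
    profile: dict = {}
    provenance: dict[str, str] = {}
    for source in priority:
        for key, value in records.get(source, {}).items():
            if value is not None and key not in profile:
                profile[key] = value
                provenance[key] = source  # remember where each field came from
    profile["_sources"] = provenance
    return profile


records = {
    "zillow": {"price": 250000, "year_built": 1985, "owner": None},
    "county": {"owner": "J. Doe", "year_built": 1984},
}
profile = merge_profile(records, priority=["county", "zillow"])
```

Keeping a provenance map next to the merged values makes it easy to show in the CRM which source each field came from.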

Zillow

The primary source for on-market properties. The crawler runs scheduled jobs filtered by city, county, state, price range, and custom criteria.

Zillow crawler main dashboard

  • Data: Price, zestimate, rent estimate, beds, baths, sqft, year built, lot size, roof type, foundation, construction materials, HOA fees, tax history, price history, days on market, flood/fire/wind risk scores
  • Enrichment: Each property is automatically cross-referenced with the county appraiser and permit databases
  • Scale: 100+ fields extracted per property
  • Filtering: Auction exclusion, zipcode whitelists/blacklists, broker filters, days-on-market range, description keyword search
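
The filters listed above can be sketched as a single predicate applied to each scraped listing. This is an illustrative sketch only: `JobFilter`, `passes`, and the listing field names (`is_auction`, `zipcode`, `days_on_market`, `description`) are assumptions, not the crawler's actual schema, and broker filters are left out for brevity.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class JobFilter:
    """Hypothetical filter configuration for one crawler job."""
    exclude_auctions: bool = True
    zip_whitelist: set[str] = field(default_factory=set)  # empty = allow all
    zip_blacklist: set[str] = field(default_factory=set)
    min_days_on_market: Optional[int] = None
    max_days_on_market: Optional[int] = None
    keywords: tuple[str, ...] = ()  # listing passes if any keyword matches


def passes(listing: dict, f: JobFilter) -> bool:
    """Return True if the listing satisfies every configured filter."""
    if f.exclude_auctions and listing.get("is_auction", False):
        return False
    zipcode = listing.get("zipcode", "")
    if f.zip_whitelist and zipcode not in f.zip_whitelist:
        return False
    if zipcode in f.zip_blacklist:
        return False
    dom = listing.get("days_on_market")
    if f.min_days_on_market is not None and (dom is None or dom < f.min_days_on_market):
        return False
    if f.max_days_on_market is not None and (dom is None or dom > f.max_days_on_market):
        return False
    if f.keywords:
        text = listing.get("description", "").lower()
        if not any(k.lower() in text for k in f.keywords):
            return False
    return True
```

A job would then keep only `[l for l in listings if passes(l, job_filter)]` before any enrichment runs.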

Job filters — city, price range, zip code lists, keywords, and more

Running crawler jobs with live log streaming

InvestorLift

Off-market wholesale deal marketplace. The challenge: InvestorLift doesn’t show street addresses in its listings; they sit behind a daily-limited lookup.

InvestorLift integration dashboard

  • Address discovery: I reverse-engineered an alternative method to retrieve full street addresses without hitting InvestorLift’s restrictive daily limits. When the primary method doesn’t return results, the system falls back through a waterfall of secondary strategies — parsing addresses from listing descriptions and using the official API only as a last resort.
  • Cross-enrichment: Once an address is found, the system automatically looks up the property on Zillow to pull additional data (tax info, market days, zestimate) and enriches the listing further.
  • Data: Price, ARV estimate, gross margin, buy-now price, seller info, property details, photos, documents

County appraiser data (30+ Florida counties)

Every property that comes through the Zillow or InvestorLift pipeline gets enriched with county-level data — owner info, tax records, building details, and parcel data.

  • Coverage: 30+ Florida counties, including all the major ones
  • Challenge: Every county runs a different system — no standard API or data format across any of them
  • Solution: Built a county adapter system that routes each property to the correct lookup based on location, handling all the variation transparently
  • Data: Owner info, tax records, building details, parcel data

Permits from multiple sources

Building permit history is critical for investment analysis — it shows what work has been done on a property and whether there are open permits.

  • Sources: County-specific permit systems, Accela portals, and municipal databases
  • Integration: Permit data is scraped per property during the enrichment phase and stored alongside the property record in the CRM
  • Data: Permit number, description, issue date, completion date, job value, status
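
To make the permit fields concrete, here is a minimal sketch of a permit record and an open-permit check. The class name, the field types, and especially the set of status strings treated as "closed" are assumptions; real permit systems use their own status vocabularies.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Permit:
    """One permit row, mirroring the fields listed above (types assumed)."""
    number: str
    description: str
    issue_date: Optional[date]
    completion_date: Optional[date]
    job_value: Optional[float]
    status: str


# Assumed status labels that mean a permit is no longer open.
CLOSED_STATUSES = {"closed", "finaled", "completed", "expired", "void"}


def open_permits(permits: list[Permit]) -> list[Permit]:
    """Return permits with no completion date and a non-closed status."""
    return [
        p for p in permits
        if p.completion_date is None
        and p.status.strip().lower() not in CLOSED_STATUSES
    ]
```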

AI enrichment

Properties are enriched using LLM APIs:

  • Crime grade estimation based on location data
  • Rent estimates for properties without existing rent data

Key challenges solved

Address discovery without API access

InvestorLift’s marketplace shows properties by location on a map but hides the actual street address behind a daily-limited “View Address” button. For an investment firm evaluating hundreds of listings, 10 address lookups per day is unusable.

I found an alternative method to resolve property coordinates to full street addresses — bypassing the daily limit entirely. The system uses a waterfall strategy: primary lookup first, then description parsing as a fallback, then the official API as a last resort. This turned address discovery from a bottleneck into a non-issue.
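
The waterfall itself is a simple pattern: try each strategy in order and stop at the first one that returns an address. The sketch below shows only that pattern plus a description-parsing fallback. The primary lookup and the official API call are deliberately left as hypothetical callables, and the regular expression is a rough illustration rather than the parser used in production.

```python
import re
from typing import Callable, Optional

# Hypothetical strategy signature: take a listing dict, return an address or None.
Strategy = Callable[[dict], Optional[str]]

# Rough pattern for US street addresses such as "123 NW Main St" (illustrative only).
ADDRESS_RE = re.compile(
    r"\b\d{1,6}\s+(?:[NSEW]{1,2}\s+)?(?:[A-Za-z0-9]+\s+){1,4}"
    r"(?:St|Street|Ave|Avenue|Rd|Road|Dr|Drive|Blvd|Ln|Lane|Ct|Court|Way)\b",
    re.IGNORECASE,
)


def from_description(listing: dict) -> Optional[str]:
    """Fallback: look for a street address in the free-text description."""
    match = ADDRESS_RE.search(listing.get("description", ""))
    return match.group(0) if match else None


def resolve_address(listing: dict, strategies: list[tuple[str, Strategy]]):
    """Try each strategy in order; return (address, strategy_name) or (None, None)."""
    for name, strategy in strategies:
        try:
            address = strategy(listing)
        except Exception:
            address = None  # a failing strategy should not stop the waterfall
        if address:
            return address.strip(), name
    return None, None
```

Returning the name of the strategy that succeeded makes it possible to track how often each fallback is actually used.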

Bot detection on Zillow

Zillow blocks automated access aggressively. The crawler routes all requests through a proxy layer that handles IP rotation, browser fingerprinting, and session management transparently. The scraper code stays clean — the proxy layer handles the evasion.

30+ different county systems

Every Florida county runs a different appraiser/assessor website. Some have APIs, some are HTML-only, some require multi-step form submissions. I built a registry of county-specific adapters — when a property comes in, the system routes it to the correct adapter based on county name and runs the appropriate lookup. New counties can be added by writing a single adapter function.
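
A registry like this can be sketched with a decorator that maps a normalized county name to an adapter function. The names `county_adapter`, `COUNTY_ADAPTERS`, and `enrich_with_county`, and the adapter signature, are assumptions for illustration; the adapter body is a placeholder rather than a real county lookup.

```python
from typing import Callable

# Hypothetical adapter signature: take an address, return appraiser fields.
Adapter = Callable[[str], dict]

COUNTY_ADAPTERS: dict[str, Adapter] = {}


def normalize(county: str) -> str:
    """Make 'Miami-Dade County' and 'miami-dade' resolve to the same key."""
    return county.lower().replace("county", "").strip()


def county_adapter(name: str):
    """Decorator that registers an adapter function under a county name."""
    def register(func: Adapter) -> Adapter:
        COUNTY_ADAPTERS[normalize(name)] = func
        return func
    return register


@county_adapter("Miami-Dade")
def lookup_miami_dade(address: str) -> dict:
    # Placeholder: a real adapter would query that county's site or API.
    return {"county": "Miami-Dade", "address": address, "owner": None}


def enrich_with_county(county: str, address: str) -> dict:
    """Route a property to its county adapter; empty dict if none is registered."""
    adapter = COUNTY_ADAPTERS.get(normalize(county))
    return adapter(address) if adapter else {}
```

With this shape, supporting a new county means adding one decorated function; the routing code does not change.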

Deduplication across sources

The same property can appear on both Zillow and InvestorLift, across multiple crawler runs, with different filters. The system tracks every scraped listing with composite keys (listing ID + publish date + job tag) and filters duplicates before making expensive API calls — not after.
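
One way to implement this kind of pre-filter is a small "seen" table whose primary key is the composite key. The sketch below uses SQLite from the standard library as a stand-in; the actual storage backend, table name, and field names (`id`, `published`) are assumptions.

```python
import sqlite3


def open_seen_store(path: str = ":memory:") -> sqlite3.Connection:
    """Create (or open) the table that tracks already-scraped listings."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS seen (
               listing_id TEXT, publish_date TEXT, job_tag TEXT,
               PRIMARY KEY (listing_id, publish_date, job_tag))"""
    )
    return conn


def mark_if_new(conn: sqlite3.Connection, listing_id: str,
                publish_date: str, job_tag: str) -> bool:
    """Insert the composite key; return True only if it was not seen before."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO seen VALUES (?, ?, ?)",
        (listing_id, publish_date, job_tag),
    )
    conn.commit()
    return cur.rowcount == 1  # 0 rows changed means the key already existed


def filter_new(conn: sqlite3.Connection, listings: list[dict], job_tag: str) -> list[dict]:
    """Drop duplicates before any expensive enrichment call is made."""
    return [l for l in listings if mark_if_new(conn, l["id"], l["published"], job_tag)]
```

Because the check happens at insert time, only listings that survive `filter_new` go on to the detail and enrichment requests.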


Price monitoring

A separate always-on service that monitors saved properties for price changes.

Price monitoring rules — thresholds, intervals, and auto-actions

  • Sources: Checks both Zillow and InvestorLift prices on a schedule
  • Tracking: Current price, buy-now price, ARV estimate, zestimate, rent estimate, listing status
  • History: Every price check is recorded with timestamp, old price, new price, change amount, and percentage
  • Auto-rules: Configurable triggers — “When price drops below $X, send an alert” or “When status changes to sold, move to Sold stage in the CRM”
  • Scheduling: Runs on configurable intervals (minutes, hours, days)
  • Sleep hours: Crawlers and price monitors automatically pause overnight, when listings rarely change, and resume in the morning, so no resources or requests are wasted while nothing is changing
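
The history record, the auto-rules, and the sleep window can be sketched in a few small functions. This is a simplified illustration: the rule format (dicts with `type`, `threshold`, `status`, `action`), the action names, and the function names are assumptions, not the CRM's real rule schema.

```python
from dataclasses import dataclass
from datetime import datetime, time, timezone


@dataclass
class PriceCheck:
    """One history row: timestamp, old/new price, change amount and percentage."""
    checked_at: datetime
    old_price: float
    new_price: float
    change: float
    percent: float


def record_check(old_price: float, new_price: float) -> PriceCheck:
    """Build the history row for a single price check."""
    change = new_price - old_price
    percent = round(change / old_price * 100, 2) if old_price else 0.0
    return PriceCheck(datetime.now(timezone.utc), old_price, new_price, change, percent)


def triggered_actions(new_price: float, new_status: str, rules: list[dict]) -> list[str]:
    """Return the actions whose rule condition is met (rule format is assumed)."""
    actions = []
    for rule in rules:
        if rule["type"] == "price_below" and new_price < rule["threshold"]:
            actions.append(rule["action"])
        elif rule["type"] == "status_is" and new_status == rule["status"]:
            actions.append(rule["action"])
    return actions


def in_sleep_window(now: time, start: time, end: time) -> bool:
    """True if `now` falls in the pause window, including windows that cross midnight."""
    if start <= end:
        return start <= now < end
    return now >= start or now < end
```

For example, a rule list with a `price_below` rule at 195,000 and a `status_is` rule for "sold" would return both actions for a listing that drops to 190,000 and is marked sold.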

The investment team gets notified of price changes as they happen, without manually checking each listing.


The CRM

The frontend is a full React application covering the complete investment lifecycle:

CRM leads view — properties flowing through the investment pipeline

Property management — grid view with filters, bulk actions, property detail pages with all enriched data from every source in one place.

Crawler control — visual job scheduler for both Zillow and InvestorLift crawlers with real-time log streaming. The team can configure filters, start/stop jobs, and monitor progress without touching the server.

Deal pipeline — funnel view tracking properties from discovery → pre-check → cold/hot → deal → pending purchase → under construction → sold. Full lifecycle management.

Flip analyzer — ROI calculator with purchase costs, holding costs, selling costs, and funding strategy. Helps the team evaluate deals before committing.
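
A flip calculation of this kind can be sketched as follows. The formula is a common simplified one and an assumption on my part, not necessarily what the analyzer uses: profit is sale price minus all costs, and ROI is profit divided by the cash actually put in (costs paid before the sale, minus any loan). Field names and the example numbers are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FlipDeal:
    purchase_price: float
    closing_costs: float
    renovation_budget: float
    monthly_holding_cost: float  # taxes, insurance, utilities, loan interest
    holding_months: int
    sale_price: float
    selling_cost_rate: float     # e.g. 0.08 for 8% commissions and fees
    loan_amount: float = 0.0     # funding strategy: portion financed


def analyze(deal: FlipDeal) -> dict:
    """Return profit, cash invested, and ROI under the simplified model above."""
    holding = deal.monthly_holding_cost * deal.holding_months
    selling = deal.sale_price * deal.selling_cost_rate
    upfront = deal.purchase_price + deal.closing_costs + deal.renovation_budget + holding
    profit = deal.sale_price - (upfront + selling)
    cash_invested = upfront - deal.loan_amount  # selling costs come out of proceeds
    roi: Optional[float] = profit / cash_invested if cash_invested > 0 else None
    return {"profit": profit, "cash_invested": cash_invested, "roi": roi}


result = analyze(FlipDeal(200000, 5000, 40000, 1500, 6, 320000, 0.08, loan_amount=150000))
```

Separating cash invested from total cost is what lets the same deal be compared under different funding strategies.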

Document management — Google Drive integration with per-property folders. BoldSign API integration for electronic document signing — contracts, agreements, and closing documents are sent, signed, and tracked directly from the CRM without switching to a separate e-signature tool.

Finance & accounting — expense tracking, receipt management, payment reminders, contractor invoicing, and CPA handoff.

Contractor onboarding — registration, verification, and project assignment for renovation contractors.

Communications — call recordings via Twilio, team calendar with Google Calendar sync, email campaigns.


Results

The system runs 24/7 in production:

  • Multiple data sources feeding into a single CRM — Zillow, InvestorLift, 30+ county appraisers, permit systems, AI enrichment
  • 100+ fields per property — assembled automatically from multiple sources
  • 30+ Florida counties with dedicated appraiser/permit enrichment
  • Address discovery working reliably at scale without hitting daily API limits
  • Automated price monitoring with change detection and alert rules
  • Full deal lifecycle from discovery to sale, managed in one place
  • Real-time crawler control — the team runs and monitors scraping jobs from the CRM itself

The client went from manually checking listings across multiple websites and spreadsheets to having everything in one system — properties discovered, enriched, tracked, and managed automatically.

Need something similar built?

I build production scraping systems for teams that need reliable data at scale. Let's talk about your project.
