How the data is built
Every plant record in Houseplantdex is built from three sources, each handled with different rules. This page explains them.
Taxonomy — GBIF
The scientific name, authorship, family, genus, and stable ID
(gbif_usage_key) come from
GBIF's species API. GBIF resolves taxonomic
synonyms automatically — when you ask for Sansevieria trifasciata
(the old snake-plant name), it returns Dracaena trifasciata (the
currently accepted one). We always store the accepted name and record
the resolution path.
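The synonym handling above can be sketched as follows. This is a minimal illustration, not the project's actual code: the field names (`usageKey`, `canonicalName`, `synonym`, `acceptedUsageKey`) are modeled on GBIF's name-match response and should be checked against the API documentation, `fetch_usage` is a hypothetical callable that looks up a usage by key and is assumed to return the same fields, and the numeric keys in the sample data are placeholders.

```python
from typing import Callable


def build_taxonomy(match: dict, fetch_usage: Callable[[int], dict]) -> dict:
    """Return the accepted name, its key, and the path taken to reach it."""
    path = [match["canonicalName"]]
    accepted = match
    if match.get("synonym"):
        # Follow the pointer from the synonym to the accepted usage.
        accepted = fetch_usage(match["acceptedUsageKey"])
        path.append(accepted["canonicalName"])
    return {
        "gbif_usage_key": accepted["usageKey"],
        "scientific_name": accepted["canonicalName"],
        "family": accepted.get("family"),
        "genus": accepted.get("genus"),
        "resolution_path": path,
    }


# Placeholder data: the keys 1 and 2 are made up, not real GBIF keys.
FAKE_USAGES = {
    2: {"usageKey": 2, "canonicalName": "Dracaena trifasciata",
        "family": "Asparagaceae", "genus": "Dracaena"},
}
synonym_match = {
    "usageKey": 1, "canonicalName": "Sansevieria trifasciata",
    "synonym": True, "acceptedUsageKey": 2,
}

record = build_taxonomy(synonym_match, FAKE_USAGES.__getitem__)
print(record["scientific_name"], record["resolution_path"])
```

Passing the lookup in as a callable keeps the resolution logic testable without a network connection.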
Toxicity — ASPCA
Whether a plant is toxic to cats, dogs, and horses comes from the ASPCA's toxic and non-toxic plant lists. Two rules we do not break:
- We never guess toxicity. If a plant appears on neither ASPCA list, its toxicity is marked unverified, even if popular wisdom says otherwise. (The ZZ plant is the canonical example.)
- We join on scientific name, not common name. "Snake plant" can mean a dozen unrelated species; only the binomial is reliable.
Each toxicity record notes whether the ASPCA match was at species or genus level, and whether it was made via a taxonomic synonym.
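A minimal sketch of that join, under stated assumptions: the table layout, the `" spp."` convention for entries that ASPCA itself lists at genus level, and every field name below are hypothetical, and the plant names are invented placeholders rather than real ASPCA data.

```python
from typing import Dict, List

# Hypothetical ASPCA-derived table keyed by scientific name.
# A key ending in " spp." stands for an entry listed at genus level.
ASPCA: Dict[str, dict] = {
    "Plantus alpha": {"cats": True, "dogs": True, "horses": False},
    "Florus spp.": {"cats": False, "dogs": False, "horses": False},
}

UNVERIFIED = {"status": "unverified", "toxic_to": None,
              "match_level": None, "via_synonym": False}


def lookup_toxicity(accepted: str, synonyms: List[str], aspca: Dict[str, dict]) -> dict:
    """Join on scientific name: accepted binomial first, then synonyms."""
    candidates = [(accepted, False)] + [(name, True) for name in synonyms]
    # Species-level matches take priority over genus-level ones.
    for name, via_synonym in candidates:
        if name in aspca:
            return {"status": "verified", "toxic_to": aspca[name],
                    "match_level": "species", "via_synonym": via_synonym}
    # Genus level applies only when the list has an explicit genus entry,
    # never by inference from a sibling species.
    for name, via_synonym in candidates:
        genus_key = name.split()[0] + " spp."
        if genus_key in aspca:
            return {"status": "verified", "toxic_to": aspca[genus_key],
                    "match_level": "genus", "via_synonym": via_synonym}
    return dict(UNVERIFIED)


print(lookup_toxicity("Plantus delta", [], ASPCA)["status"])
```

In this sketch `Plantus delta` stays unverified even though its sibling `Plantus alpha` is listed, because no genus-level entry exists for `Plantus`.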
Care attributes — generated, then verified
No single openly available source publishes structured care data for every houseplant. We generate light, water, humidity, temperature, soil, fertilizer, problems, and propagation values, then verify each one against two independent, reputable references, preferring the Missouri Botanical Garden Plant Finder, the Royal Horticultural Society, and university extension services (NC State, Iowa State, etc.).
For each field, the verification record notes the value seen at each source and the agreement status. Where sources genuinely disagree, we mark the field disputed rather than pick a winner.
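One way such a verification record might be computed is sketched below. The status names (`agreed`, `disputed`, `unverified`), the record layout, and the source names are assumptions for illustration. The sketch also assumes values are already normalised to comparable strings, and that any mismatch, including one between the generated value and the references, counts as disputed.

```python
def verify_field(generated, seen):
    """Build a verification record for one care field.

    `seen` maps a source name to the value found there, or None when the
    source says nothing about this field.
    """
    found = {src: val for src, val in seen.items() if val is not None}
    if len(found) < 2:
        # Fewer than two independent references: nothing to confirm against.
        status = "unverified"
    elif len(set(found.values()) | {generated}) == 1:
        # Every reference matches the generated value.
        status = "agreed"
    else:
        # Record the disagreement; no winner is picked.
        status = "disputed"
    return {"generated": generated, "seen": dict(seen), "status": status}


# Illustrative values only; they are not taken from any real reference.
result = verify_field(
    "bright indirect",
    {"source_a": "bright indirect", "source_b": "full sun"},
)
print(result["status"])
```

The record keeps every value seen, so a disputed field still shows the reader what each reference said.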
What we won't claim
- That a plant is non-toxic when ASPCA doesn't say so.
- That a plant is toxic just because a sibling species is.
- A precise watering frequency (every reference qualifies it with conditions).
- An indoor height for a plant we haven't independently checked.
Updates
The dataset is versioned in a Git repository. Each plant record carries
a provenance block with fetch and verification dates so
anything stale is visible.
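As an illustration of how those dates could surface stale data, here is a small sketch. The provenance field names, the ISO date format, and the 365-day threshold are all assumptions, not part of the dataset's documented schema.

```python
from datetime import date, timedelta


def stale_entries(provenance: dict, today: date, max_age_days: int = 365) -> list:
    """Return the provenance keys whose ISO date is older than the cutoff."""
    cutoff = today - timedelta(days=max_age_days)
    return sorted(
        name for name, iso_date in provenance.items()
        if date.fromisoformat(iso_date) < cutoff
    )


# Hypothetical provenance block for one plant record.
provenance = {
    "gbif_fetched": "2023-01-10",
    "aspca_fetched": "2024-11-02",
    "care_verified": "2024-12-15",
}

stale = stale_entries(provenance, today=date(2025, 1, 1))
print(stale)
```

Passing `today` explicitly keeps the check deterministic, which makes it easy to run in tests or in a scheduled job.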