Simulated Hospital — setup, integration & notes
The activity plane is Google’s Simulated Hospital (“Simhospital”): a synthetic HL7v2 generator that behaves like a hospital that is always open. It is what makes the EMR a populated, moving system rather than an empty shell. This page is the deep dive — the setup choices and the things that cost real time to get right.
Built from source
The official eu.gcr.io/simhospital-images/simhospital image is no longer
anonymously pullable, and upstream ships no Dockerfile (it builds via Bazel)
and no go.mod. So gh-simhospital/build/ synthesises a Go-modules build:
- A pinned upstream commit, fetched in a multi-stage Dockerfile.
- A committed go.mod/go.sum (generated once with go mod tidy) so the build is reproducible, not re-resolved each time.
- GOTOOLCHAIN=auto, because a transitive dependency now requires a newer Go than the base image ships.
Theming — doctors and wards
Simhospital takes your own doctors and locations:
- config/doctors.yml: ordering/attending physicians. These are the lab’s cast, the same names that exist as AD accounts, so every message names someone you can look up in the directory.
- config/locations.yml: the wards (ED, ICU, Nephrology, Cardiology, Maternity, Pediatrics, Oncology). Pathways reference these by key; the poc code populates the HL7 PV1 assigned-location.
US locale & financial realism
Out of the box Simhospital generates UK patients (NHS numbers, London
addresses, GBR), which jars against a US cast admitting to “General Hospital.”
A locale pack (config/locales/us/data.yml, wired in via -data_config_file)
plus a set of source patches (build/patches/) make the patients coherent:
- US demographics: names and Chicago-area addresses (Evanston, Oak Park, Cicero, Naperville…), country = USA, 5-digit ZIPs.
- National ID: a US SSN in an SS-typed PID-3 repetition (not an NHS number), and on the FHIR Patient.
- Insurance: an IN1 segment on admit (Medicare / commercial), carried across the financial-lifecycle ADT messages.
- Guarantor: a GT1 segment; for a minor the guarantor is a parent.
- Cast cameo: occasionally a patient is named after a recognizable character from the medical-show universe (show-weighted, low probability), a wink for demos. The attending clinicians still come from doctors.yml.
The interface engine
maps these through to OpenEMR: the SSN and address onto patient_data, IN1
into insurance_data, and the GT1 guarantor as the policy subscriber (so a
pediatric patient shows a parent as the policyholder).
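
To make the SS-typed PID-3 repetition concrete, here is a minimal Python sketch of how a consumer could pull the SSN out of a PID segment. It relies only on standard HL7v2 delimiters (`|` fields, `~` repetitions, `^` components) and on the CX data type placing the identifier type code in component 5. The function name, the sample segment, the assigning authorities (GH, USSSA), and the deliberately invalid SSN are all assumptions for illustration; this is not the interface engine's actual mapping code.

```python
from typing import Optional


def find_identifier(pid_segment: str, type_code: str) -> Optional[str]:
    """Return the ID of the first PID-3 repetition whose CX.5 type code matches."""
    fields = pid_segment.split("|")
    if fields[0] != "PID" or len(fields) <= 3:
        return None
    for repetition in fields[3].split("~"):
        components = repetition.split("^")
        # CX.1 is the ID value; CX.5 is the identifier type code (e.g. MR, SS).
        if len(components) >= 5 and components[4] == type_code:
            return components[0]
    return None


# Hypothetical segment: an MRN repetition followed by an SS-typed repetition.
SAMPLE_PID = "PID|1||100234^^^GH^MR~000-12-3456^^^USSSA^SS||DOE^JANE||19800101|F"

ssn = find_identifier(SAMPLE_PID, "SS")
mrn = find_identifier(SAMPLE_PID, "MR")
print(ssn, mrn)
```

Selecting by type code rather than by position keeps the lookup working if the order of PID-3 repetitions changes.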
Pathways — what segments the traffic carries
A pathway is a YAML patient journey. The live distribution spans 24 active
pathways across three files, with percentage_of_patients summing to exactly
100 so the distribution manager runs them all:
- gh_pathways.yml: the original six (ED chest pain, AKI dialysis, surgical ICU, peds fever, maternity, onc infusion), now rebalanced to 38%.
- gh_clinical_pathways.yml: 13 service-line journeys (49%), namely NSTEMI, CKD exacerbation, neutropenic fever, upper-GI bleed, COPD, ischemic stroke, hip fracture, urosepsis, DKA, preeclampsia, bronchiolitis, suicidal-ideation hold, anaphylaxis.
- gh_fax_pathways.yml: 5 fax-workflow journeys (13%); see below.
Two more files stay at 0% (run on demand, not in the live mix): the
gh_negative_pathways.yml malformed-message set and gh_fhir_demo.yml.
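
The sum-to-100 constraint is easy to break when rebalancing, so a pre-flight check can be worth having. The sketch below is one possible shape for it, using only the per-file totals stated above; it does not parse the YAML, and the function name is hypothetical. A real check would sum the individual percentage_of_patients values read from the files.

```python
import math

# Per-file totals as stated on this page; the two on-demand files contribute 0.
FILE_TOTALS = {
    "gh_pathways.yml": 38,
    "gh_clinical_pathways.yml": 49,
    "gh_fax_pathways.yml": 13,
    "gh_negative_pathways.yml": 0,
    "gh_fhir_demo.yml": 0,
}


def check_distribution(totals: dict[str, float]) -> float:
    """Return the combined percentage, raising if it is not 100."""
    total = math.fsum(totals.values())
    if not math.isclose(total, 100.0):
        raise ValueError(f"percentage_of_patients sums to {total}, expected 100")
    return total


live_total = check_distribution(FILE_TOTALS)
print(live_total)
```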
The enrichment that makes each chart rich is all pathway-driven, no code:
| Want this segment | Add this to the pathway |
|---|---|
| AL1 (allergy) | an allergies: list on an admission/registration step → rides on the ADT |
| DG1 (diagnosis) | a diagnoses: list on an update_person step → emits an ADT^A08 |
| PR1 (procedure) | a procedures: list on an update_person step → emits an ADT^A08 |
| OBX lab trends | historical_data (backdated results) + result steps |
Fax-workflow pathways
gh_fax_pathways.yml models the clinical episodes that, in a real hospital,
generate fax traffic — useful for exercising the Print & Fax
emulation (faxart). Simhospital emits HL7, not
faxes, so each pathway models the episode and the fax artifact it represents.
Notably, the clinical_note step in this build lands in OpenEMR as a
clearly-named procedure order, so the fax artifact is visible in the chart:
| Pathway | Episode | Fax artifact (shows in OpenEMR as) |
|---|---|---|
| fax_referral_cardiology | registration + AFib diagnosis | Referral Letter |
| fax_discharge_summary | admit → pneumonia → discharge | (discharge summary to PCP) |
| fax_results_to_referrer | order + results | Laboratory Report |
| fax_prior_auth | OA diagnosis + knee replacement | Prior Authorization Request |
| fax_pharmacy_script | bronchitis + prescription | Prescription |
So a faxart demo can point at a recognizable, named order on a patient that arrived as synthetic HL7 traffic, rather than a hand-made document.
Gotcha 1 — let Simhospital compute lab values
Simhospital does semantic validation on results: a specified abnormal_flag
must match what the value-versus-reference-range computes. Hand-picking a value
and a flag that disagree (or omitting the flag on an out-of-range value) is a
fatal error that crash-loops the container. The YAML lints fine; only a live
run catches it.
The robust fix, used here: drop explicit results: value lists and keep just the
order_profile. Simhospital then generates in-spec values and derives the
abnormal flags itself — still realistic, always valid.
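
To see why a hand-picked value and flag can disagree, here is a rough Python sketch of the kind of consistency rule described above. It is not Simhospital's code: the H/L/N codes follow the common HL7 abnormal-flag convention, the inclusive range bounds are an assumption, and the potassium numbers are illustrative only.

```python
from typing import Optional


def derive_flag(value: float, low: float, high: float) -> str:
    """Derive a flag from a value and its reference range (bounds count as normal)."""
    if value < low:
        return "L"
    if value > high:
        return "H"
    return "N"


def flag_is_consistent(value: float, low: float, high: float,
                       declared: Optional[str]) -> bool:
    """True when the declared flag agrees with the derived one.

    A missing flag is only acceptable for an in-range value, mirroring the
    'omitted flag on an out-of-range value' failure described above.
    """
    derived = derive_flag(value, low, high)
    if declared is None:
        return derived == "N"
    return declared == derived


# Illustrative potassium result: 6.2 against a 3.5-5.1 reference range.
print(derive_flag(6.2, 3.5, 5.1))
print(flag_is_consistent(6.2, 3.5, 5.1, "L"))
print(flag_is_consistent(6.2, 3.5, 5.1, None))
```

Running a check like this over hand-written results before a deploy would catch the mismatch that YAML linting misses, although dropping the explicit values remains the simpler fix.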
Gotcha 2 — delays are real-elapsed time
Pathway delay steps are wall-clock, not simulated-fast. A transfer scheduled
“2h later” actually fires two hours later. The practical consequences:
- A short burst only produces the early events (admits, and the allergies that ride on them). Diagnoses (A08) and discharges (A03) arrive much later.
- To verify the late events quickly, send crafted MLLP messages directly, or use a no-delay pathway in deterministic mode.
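
For the crafted-message route, a small MLLP client is enough. The Python sketch below frames an HL7v2 message the standard MLLP way (a 0x0B start byte, then the message, then 0x1C 0x0D) and reads back the ACK frame. The message content, the control ID, and the sending application name are all made up for illustration; oie:6661 is the default MLLP_DEST from this page's compose command and may differ in your environment.

```python
import re
import socket

START_BLOCK = b"\x0b"      # MLLP start-of-block (VT)
END_BLOCK = b"\x1c\x0d"    # MLLP end-of-block (FS) followed by CR


def frame_mllp(hl7_message: str) -> bytes:
    """Wrap an HL7v2 message in MLLP framing, terminating each segment with CR."""
    segments = [s for s in re.split(r"\r\n|\r|\n", hl7_message) if s]
    body = "".join(segment + "\r" for segment in segments).encode("utf-8")
    return START_BLOCK + body + END_BLOCK


def send_mllp(host: str, port: int, hl7_message: str, timeout: float = 10.0) -> bytes:
    """Send one framed message and return the raw ACK frame from the receiver."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(frame_mllp(hl7_message))
        ack = b""
        while not ack.endswith(END_BLOCK):
            chunk = conn.recv(4096)
            if not chunk:
                break
            ack += chunk
        return ack


# Hypothetical ADT^A08 carrying one DG1; every field value here is illustrative.
CRAFTED_A08 = "\n".join([
    "MSH|^~\\&|CRAFTED|GH|OIE|GH|20240101120000||ADT^A08|MSG0001|P|2.3",
    "EVN|A08|20240101120000",
    "PID|1||100234^^^GH^MR||DOE^JANE||19800101|F",
    "PV1|1|I|ICU",
    "DG1|1||E11.9^Type 2 diabetes mellitus without complications^I10",
])

framed = frame_mllp(CRAFTED_A08)
print(len(framed))
# send_mllp("oie", 6661, CRAFTED_A08)  # run where the engine is reachable
```

The network call is left commented out so the framing can be inspected without a running engine.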
Output and rate
The compose command sends HL7 to the engine over MLLP:
- -output=${SIM_OUTPUT:-mllp}
- -mllp_destination=${MLLP_DEST:-oie:6661}
- -pathways_per_hour=${PATHWAYS_PER_HOUR:-120}
SIM_OUTPUT=stdout falls back to logging on the bus-free path. To seed a
batch, raise PATHWAYS_PER_HOUR (e.g. 2500), run make restart, let it run, then
dial back to ~120 for a living trickle. The MLLP sender blocks on the ACK, so the
engine’s auto-ACK must be working or throughput collapses; see the
gh-integration notes on the mandatory responseGenerationProperties.
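
As a back-of-envelope aid, the arithmetic behind those two rates can be sketched in Python. Two simplifying assumptions apply and neither is confirmed by Simhospital's documentation: pathway starts are spread evenly across the hour, and the sender is strictly serial, waiting for each ACK before sending the next message. The 50 ms round trip is a made-up figure.

```python
def seconds_between_starts(pathways_per_hour: float) -> float:
    """Average spacing between pathway starts, assuming an even spread."""
    return 3600.0 / pathways_per_hour


def max_messages_per_hour(ack_round_trip_seconds: float) -> float:
    """Upper bound for a strictly serial sender that waits for each ACK."""
    return 3600.0 / ack_round_trip_seconds


trickle = seconds_between_starts(120)     # 30.0 seconds between starts
seed = seconds_between_starts(2500)       # 1.44 seconds between starts
ceiling = max_messages_per_hour(0.05)     # hypothetical 50 ms ACK round trip
print(trickle, seed, ceiling)
```

Since each pathway emits several messages, the message rate is a multiple of the pathway rate, which is why a slow or missing ACK hurts most during a seed run.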
FHIR output (experimental)
Simhospital can emit FHIR R4 alongside HL7. Two things were essential to learn:
- Resource generation is step-triggered. A pathway must end with a generate_resources: {} step or the FHIR writer is never invoked (it silently logs written: 0).
- Observations break the marshaller. With a result/lab step, the google/fhir JSON marshaller errors on the Observation’s Quantity value (JSONRawValue: invalid character ...). Upstream pins google/fhir@a54aa66 (~2020) via Bazel; the from-source build resolves the modern v0.7.4, whose Quantity marshalling differs. Pinning the old library would drag in incompatible 2020-era protobuf APIs and break the build.
So make fhir-sample runs a no-delay fhir_demo pathway in deterministic mode
without lab steps, and produces valid bundles of Patient, Encounter,
AllergyIntolerance, Condition, Location, and Practitioner — everything
except Observation.
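
A quick way to sanity-check a sample is to count resource types per bundle. The Python sketch below assumes only the standard FHIR Bundle JSON shape (resourceType of Bundle, with entry[].resource.resourceType); the sample bundle is hand-written for illustration and is not real output, and whether every generated bundle contains all six types is an assumption to verify against your own files.

```python
import json
from collections import Counter

EXPECTED_TYPES = {
    "Patient", "Encounter", "AllergyIntolerance",
    "Condition", "Location", "Practitioner",
}


def resource_type_counts(bundle_json: str) -> Counter:
    """Count resources in a FHIR Bundle by resourceType."""
    bundle = json.loads(bundle_json)
    if bundle.get("resourceType") != "Bundle":
        raise ValueError("not a FHIR Bundle")
    return Counter(
        entry["resource"]["resourceType"]
        for entry in bundle.get("entry", [])
        if "resource" in entry
    )


# Hand-written stand-in for one generated bundle (illustrative only).
SAMPLE_BUNDLE = json.dumps({
    "resourceType": "Bundle",
    "type": "collection",
    "entry": [{"resource": {"resourceType": t}} for t in sorted(EXPECTED_TYPES)],
})

counts = resource_type_counts(SAMPLE_BUNDLE)
missing = EXPECTED_TYPES - set(counts)
has_observation = "Observation" in counts
print(dict(counts), missing, has_observation)
```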
Negative testing and deterministic mode
- The built image bundles upstream’s malformed messages (InvalidNhsNum, InvalidOru_MissingPlacerAndFiller). A pathway emits one with a hardcoded_message step that selects by regex (there is no name: field). config/pathways/gh_negative_pathways.yml defines these at 0% so they never fire in the live mix; fire them on demand to exercise the engine’s error handling.
- For repeatable demos, switch the manager with -pathway_manager_type=deterministic -pathway_names=<comma-list> so the named pathways run in a fixed order instead of the weighted distribution.
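
The difference between the two managers can be pictured with a conceptual Python sketch. It is not how Simhospital implements either mode; it only contrasts a weighted random pick, where a 0% pathway is never chosen, with a fixed order taken from a comma-separated list. The weights are hypothetical and invalid_nhs_number is a made-up pathway name.

```python
import random


def weighted_pick(weights: dict[str, float], rng: random.Random) -> str:
    """Pick one pathway name at random, in proportion to its weight."""
    names = [name for name, weight in weights.items() if weight > 0]
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


def deterministic_order(pathway_names: str) -> list[str]:
    """Split a comma-separated list into the fixed run order."""
    return [name.strip() for name in pathway_names.split(",") if name.strip()]


# Hypothetical weights: the negative pathway sits at 0% in the live mix.
WEIGHTS = {
    "fax_referral_cardiology": 3,
    "fax_discharge_summary": 3,
    "invalid_nhs_number": 0,
}

rng = random.Random(42)  # seeded so the draws are repeatable
draws = {weighted_pick(WEIGHTS, rng) for _ in range(1000)}
order = deterministic_order("invalid_nhs_number, fax_discharge_summary")
print(sorted(draws), order)
```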
Where it fits
Simhospital feeds the integration plane, which writes into OpenEMR. To bring it up with the rest, see Run the clinical ecosystem.