Job description
Benchmark dataset project evaluating AI models on visual document understanding and instruction following in the Real Estate Appraisal domain. Experts author complex, grounded tasks, each with a clear ground-truth output and an objective rubric. ~15–20 hrs/week, remote, US/Canada.