LLM Dataset Curation