How was this not obvious from the start? If it's generating content, then the generated content can deviate from the original data. If I want accurate results, then I copy the data without generating new content.