Deduplicator
Deduplicator is the class for eliminating duplicate instruction samples, which can adversely affect both pre-training stability and the performance of LLMs. Deduplicator also enables more efficient use and optimization of storage space.
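The general idea can be illustrated with a minimal sketch of exact-match deduplication. This is not the Deduplicator implementation or its API: the function names `sample_key` and `deduplicate`, the `instruction`/`output` field names, and the normalization (lowercasing and collapsing whitespace) are all assumptions made for illustration.

```python
import hashlib
import json


def sample_key(sample: dict) -> str:
    """Return a SHA-256 digest of a sample after light normalization.

    Hypothetical helper: every field value is lowercased and its whitespace
    is collapsed, so trivially different copies map to the same key.
    """
    normalized = {k: " ".join(str(v).lower().split()) for k, v in sample.items()}
    payload = json.dumps(normalized, sort_keys=True, ensure_ascii=False)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def deduplicate(samples: list[dict]) -> list[dict]:
    """Keep the first occurrence of each sample, preserving input order."""
    seen = set()
    unique = []
    for sample in samples:
        key = sample_key(sample)
        if key not in seen:
            seen.add(key)
            unique.append(sample)
    return unique


if __name__ == "__main__":
    # Illustrative data only; the field names are assumed, not prescribed.
    data = [
        {"instruction": "Translate to French", "output": "Bonjour"},
        {"instruction": "translate  to french", "output": "bonjour"},
        {"instruction": "Sum 1 and 2", "output": "3"},
    ]
    print(len(deduplicate(data)))
```

Storing only a fixed-size digest per sample keeps memory use modest on large datasets. Catching near-duplicates (paraphrases, small edits) would need a fuzzy method instead, which this sketch does not attempt.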