Brazil’s Unified Health System (SUS) is a continental-scale network that serves more than 180 million people per year, which makes interoperability essential. Unstructured clinical notes are a particular obstacle to it. This study therefore applies large language models (LLMs) to the anonymization and structuring of clinical notes, considering statistical performance, language, clinical context, and patient diversity, and assessing the notes as a representative dataset for the SUS. It seeks to promote the interoperability of these notes and to establish a gold standard for the future assessment of LLMs. The proposal targets health administrators, who will be able to obtain structured, interoperable information for better decision-making; researchers and professionals, who will benefit from accessible data for developing studies and tools better suited to the needs of the SUS; and patients, who will benefit indirectly. The project has the potential to improve clinical practice, data privacy, patient care, research, and health policy in the SUS. It will also pave the way for other applications of artificial intelligence in health, promoting synergy with advanced technology by increasing the capacity for data integration and acquisition and by establishing representative mechanisms for assessing the reality of the SUS.