This article addresses the synergy between Artificial Intelligence (AI), especially Generative AI (GenAI), and Knowledge Management (KM), highlighting knowledge as the main factor of production and an intangible asset crucial for innovation and digital transformation. The objective is to present and analyze how Petrobras has been facing the challenge of applying AI to its KM processes. The study used the case study methodology, with Petrobras as the unit of analysis. The methodological procedures included the analysis of internal reports, scouting (market research) reports, and external consultancies; examination of the procurement documentation for the initiative's solution; discussions with members of the KM Team; and bibliographic research. The results indicate that, after internal experimentation and international mapping of solutions, 42 business requirements across 13 dimensions were defined for an AI solution suited to the company's needs. The economic viability of this type of initiative was demonstrated by the projected time savings in searching for information (estimated at 30 minutes per day per employee). Additionally, a governance framework grounded in Responsible AI was established. In conclusion, the study evidences a structured digital transformation process, reinforcing that the success of AI in KM depends on robust governance and on the indispensable valuing of the human factor to mitigate risks and maximize intellectual capital.
Diplomatics. Archives. Seals, Bibliography. Library science. Information resources
SEAL is a static analyser for the verification of programs that manipulate unbounded linked data structures. It is based on separation logic to represent abstract memory states and, unlike other separation-logic-based approaches, it employs the general-purpose separation-logic solver Astral for satisfiability and entailment checking, itself based on a translation to SMT. This design results in a modular architecture intended to be easier to extend and to combine with reasoning in other theories. Although still a prototype, SEAL achieved competitive results in the LinkedLists base category and was one of only four analysers capable of verifying programs with unbounded lists. We believe that the tool's extensibility, combined with further development, can lead to significant improvements in future competitions.
Aleksei Adadurov, Sergey Barseghyan, Anton Chtepine
et al.
We study optimal auction design for Maximum Extractable Value (MEV) auction markets on Ethereum. Using a dataset of 2.2 million transactions across three major orderflow providers, we establish three empirical regularities: extracted values follow a log-normal distribution with extreme right-tail concentration, competition intensity varies substantially across MEV types, and the standard Revenue Equivalence Theorem breaks down due to affiliation among searchers' valuations. We model this affiliation through a Gaussian common factor, deriving equilibrium bidding strategies and expected revenues for five auction formats (first-price sealed-bid, second-price sealed-bid, English, Dutch, and all-pay) across a fine grid of bidder counts $n$ and affiliation parameters $\rho$. Our simulations confirm the Milgrom-Weber linkage principle: English and second-price sealed-bid auctions strictly dominate Dutch and first-price sealed-bid formats for any $\rho > 0$, with a linkage gap of 14-28\% at moderate affiliation ($\rho = 0.5$) and up to 30\% for small bidder counts. Applied to observed bribe totals, this gap corresponds to \$10-18 million in foregone revenue over the sample period. We also document a novel non-monotonicity: at large $n$ and high $\rho$, revenue peaks in the interior of the affiliation parameter space and declines thereafter, as near-perfect correlation collapses the order-statistic spread that drives competitive payments.
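The mechanism behind that final observation can be sketched in a few lines. The toy simulation below is an illustration under assumed parameters, not the paper's calibration: bidders draw log-normal values from a Gaussian common factor, and as the affiliation parameter rises, the gap between the highest and second-highest valuation (the spread a second-price auction converts into revenue) shrinks.

```python
import math
import random

def spread(rho, n=10, trials=20000, seed=0):
    """Mean gap between the highest and second-highest log-normal
    valuations when bidders share a Gaussian common factor z."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        z = rng.gauss(0, 1)  # common factor shared by all bidders
        vals = sorted(
            math.exp(rho * z + math.sqrt(1 - rho**2) * rng.gauss(0, 1))
            for _ in range(n)
        )
        total += vals[-1] - vals[-2]  # order-statistic spread
    return total / trials

# Stronger affiliation -> values move together -> smaller spread.
print(spread(0.1), spread(0.9))
```

With low affiliation the idiosyncratic draws dominate and the top-two gap stays wide; near-perfect correlation makes all values nearly proportional, collapsing the competitive spread.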
Historical archives contain qualitative descriptions of climate events, yet converting these into quantitative records has remained a fundamental challenge. Here we introduce a paradigm shift: a generative AI framework that inverts the logic of historical chroniclers by inferring the quantitative climate patterns associated with documented events. Applied to historical Chinese archives, it produces a sub-annual precipitation reconstruction for southeastern China over the period 1368-1911 AD. Our reconstruction not only quantifies iconic extremes like the Ming Dynasty's Great Drought but also, crucially, maps the full spatial and seasonal structure of El Niño influence on precipitation in this region over five centuries, revealing dynamics inaccessible in shorter modern records. Our methodology and high-resolution climate dataset are directly applicable to climate science and have broader implications for the historical and social sciences.
Large language models (LLMs) have achieved remarkable success across a wide range of natural language processing tasks, demonstrating human-level performance in text generation, reasoning, and question answering. However, training such models requires substantial computational resources, large curated datasets, and sophisticated alignment procedures. As a result, they constitute highly valuable intellectual property (IP) assets that warrant robust protection mechanisms. Existing IP protection approaches suffer from critical limitations. Model fingerprinting techniques can identify model architectures but fail to establish ownership of specific model instances. In contrast, traditional backdoor-based watermarking methods embed behavioral anomalies that can be easily removed through common post-processing operations such as fine-tuning or knowledge distillation. We propose SEAL, a subspace-anchored watermarking framework that embeds multi-bit signatures directly into the model's latent representational space, supporting both white-box and black-box verification scenarios. Our approach leverages model editing techniques to align the hidden representations of selected anchor samples with predefined orthogonal bit vectors. This alignment embeds the watermark while preserving the model's original factual predictions, rendering the watermark functionally harmless and stealthy. We conduct comprehensive experiments on multiple benchmark datasets and six prominent LLMs, comparing SEAL with 11 existing fingerprinting and watermarking methods to demonstrate its superior effectiveness, fidelity, efficiency, and robustness. Furthermore, we evaluate SEAL under potential knowledgeable attacks and show that it maintains strong verification performance even when adversaries possess knowledge of the watermarking mechanism and the embedded signatures.
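The core embedding idea above, aligning hidden representations with predefined orthogonal bit vectors and reading the signature back from projections, can be illustrated without any model at all. The sketch below is a hypothetical toy (the function names and dimensions are invented, and the real system edits an LLM's latent space rather than a free-standing vector): each signature bit maps to an orthonormal direction, and verification checks the sign of the projection along each direction.

```python
import random

def orthogonal_vectors(k, dim, seed=0):
    """Gram-Schmidt on random Gaussian vectors -> k orthonormal
    directions, one per signature bit."""
    rng = random.Random(seed)
    basis = []
    while len(basis) < k:
        v = [rng.gauss(0, 1) for _ in range(dim)]
        for b in basis:  # remove components along existing directions
            dot = sum(x * y for x, y in zip(v, b))
            v = [x - dot * y for x, y in zip(v, b)]
        norm = sum(x * x for x in v) ** 0.5
        if norm > 1e-8:
            basis.append([x / norm for x in v])
    return basis

def embed(bits, basis, scale=1.0):
    """Build a representation aligned with +/- each bit's direction."""
    rep = [0.0] * len(basis[0])
    for bit, b in zip(bits, basis):
        sign = scale if bit else -scale
        rep = [r + sign * x for r, x in zip(rep, b)]
    return rep

def decode(rep, basis):
    """Recover bits from the sign of the projection on each direction."""
    return [int(sum(r * x for r, x in zip(rep, b)) > 0) for b in basis]

basis = orthogonal_vectors(8, 64)
signature = [1, 0, 1, 1, 0, 0, 1, 0]
assert decode(embed(signature, basis), basis) == signature
```

Orthogonality keeps the bits independent: perturbing one direction leaves the projections on the others untouched, which is what makes multi-bit signatures readable.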
Historical archives on weather events are collections of enduring primary source records that offer rich, untapped narratives of how societies have experienced and responded to extreme weather events. These qualitative accounts provide insights into societal vulnerability and resilience that are largely absent from meteorological records, making them valuable for climate scientists to understand societal responses. However, their vast scale, noisy digitized quality, and archaic language make it difficult to transform them into structured knowledge for climate research. To address this challenge, we introduce WeatherArchive-Bench, the first benchmark for evaluating retrieval-augmented generation (RAG) systems on historical weather archives. WeatherArchive-Bench comprises two tasks: WeatherArchive-Retrieval, which measures a system's ability to locate historically relevant passages from over one million archival news segments, and WeatherArchive-Assessment, which evaluates whether Large Language Models (LLMs) can classify societal vulnerability and resilience indicators from extreme weather narratives. Extensive experiments across sparse, dense, and re-ranking retrievers, as well as a diverse set of LLMs, reveal that dense retrievers often fail on historical terminology, while LLMs frequently misinterpret vulnerability and resilience concepts. These findings highlight key limitations in reasoning about complex societal indicators and provide insights for designing more robust climate-focused RAG systems from archival contexts. The constructed dataset and evaluation framework are publicly available at https://anonymous.4open.science/r/WeatherArchive-Bench/.
We introduce Radar DataTree, the first dataset-level framework that extends the WMO FM-301 standard from individual radar volume scans to time-resolved, analysis-ready archives. Weather radar data are among the most scientifically valuable yet structurally underutilized Earth observation datasets. Despite widespread public availability, radar archives remain fragmented, vendor-specific, and poorly aligned with FAIR (Findable, Accessible, Interoperable, Reusable) principles, hindering large-scale research, reproducibility, and cloud-native computation. Radar DataTree addresses these limitations with a scalable, open-source architecture that transforms operational radar archives into FAIR-compliant, cloud-optimized datasets. Built on the FM-301/CfRadial 2.1 standard and implemented using xarray DataTree, Radar DataTree organizes radar volume scans as hierarchical, metadata-rich structures and serializes them to Zarr for scalable analysis. Coupled with Icechunk for ACID-compliant storage and versioning, this architecture enables efficient, parallel computation across thousands of radar scans with minimal preprocessing. We demonstrate significant performance gains in case studies including Quasi-Vertical Profile (QVP) and precipitation accumulation workflows, and release all tools and datasets openly via the Raw2Zarr repository. This work contributes a reproducible and extensible foundation for radar data stewardship, high-performance geoscience, and AI-ready weather infrastructure.
Victoria L. Lemieux, Rosa Gil, Faith Molosiwa
et al.
As archives turn to artificial intelligence to manage growing volumes of digital records, privacy risks inherent in current AI data practices raise critical concerns about data sovereignty and ethical accountability. This paper explores how privacy-enhancing technologies (PETs) and Web3 architectures can support archives to preserve control over sensitive content while still being able to make it available for access by researchers. We present Clio-X, a decentralized, privacy-first Web3 digital solution designed to embed PETs into archival workflows and support AI-enabled reference and access. Drawing on a user evaluation of a medium-fidelity prototype, the study reveals both interest in the potential of the solution and significant barriers to adoption related to trust, system opacity, economic concerns, and governance. Using Rogers' Diffusion of Innovation theory, we analyze the sociotechnical dimensions of these barriers and propose a path forward centered on participatory design and decentralized governance through a Clio-X Decentralized Autonomous Organization. By integrating technical safeguards with community-based oversight, Clio-X offers a novel model to ethically deploy AI in cultural heritage contexts.
The aim of this article is to present Tadeusz Smoleński's relationship with the Akademia Umiejętności (Academy of Learning), whose support enabled him to expand his knowledge of ancient Egypt. Smoleński studied history and geography at the Jagiellonian University. In 1904, doctors diagnosed him with a lung disease, which forced him to interrupt his studies and seek climatic treatment in Egypt. In Cairo he began studies in Egyptian archaeology and philology under the direction of Gaston Maspero. His difficult financial situation in the land of the pharaohs prompted him to turn for help to Bolesław Ulanowski, Secretary General of the Akademia Umiejętności. In the years 1905/1906, 1906/1907, and 1907/1908, Smoleński received from the Academy's board a scholarship from the legacy of Malwina Jankowska, amounting to 600 Austrian crowns per year. In 1907 and 1908 he took part in two Austro-Hungarian excavation expeditions at Sharuna and El-Gamhud. Thanks to the young Egyptologist's participation in these excavation campaigns on the Nile, the Akademia Umiejętności received four sarcophagi with mummies from El-Gamhud and two limestone slabs from the Ptolemaic temple at Sharuna.
Sealed-bid auctions play a crucial role in blockchain ecosystems. Previous works introduced viable blockchain sealed-bid auction protocols, leveraging timed commitments for bid encryption. However, a crucial challenge remains unresolved in these works: Who should bear the cost of decrypting these timed commitments? This work introduces a timed commitment outsourcing market as a solution to the aforementioned challenge. We first introduce an aggregation scheme for timed commitments, which combines all bidders' timed commitments into one while ensuring security and correctness and allowing a varying number of bidders. Next, we remodel the utility of auctioneers and timed commitment solvers, developing a new timed commitment competition mechanism and combining it with the sealed-bid auction to form a two-sided market. The protocol includes bid commitment collection, timed commitment solving, and payment. Through game-theoretical analysis, we prove that our protocol satisfies Dominant Strategy Incentive Compatibility (DSIC) for bidders, Bayesian Incentive Compatibility (BIC) for solvers, and achieves optimal revenue for the auctioneer among a large class of mechanisms. Finally, we prove that no mechanism can achieve positive expected revenue for the auctioneer while satisfying DSIC and Individual Rationality (IR) for both bidders and solvers.
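Timed commitments of the kind this market prices build on sequentially hard computation in the spirit of the classic Rivest-Shamir-Wagner time-lock puzzle. The toy below (insecure, tiny parameters, invented variable names; not this paper's aggregation scheme) shows the asymmetry being outsourced: the committer uses the secret factorization of n to lock a bid cheaply, while a solver must perform t sequential squarings to open it.

```python
# Toy RSW time-lock puzzle with tiny, insecure parameters.
p, q = 10007, 10009               # secret primes; only n = p*q is public
n, phi = p * q, (p - 1) * (q - 1)
t = 10_000                        # required number of sequential squarings
a = 2
secret_bid = 424242

# Committer: shortcut via phi(n) -- computes a^(2^t) mod n in O(log) time.
lock = pow(a, pow(2, t, phi), n)
ciphertext = (secret_bid + lock) % n  # blind the bid with the lock value

# Solver: without phi(n), must do t sequential modular squarings.
x = a
for _ in range(t):
    x = (x * x) % n
assert x == lock
print((ciphertext - x) % n)
```

The gap between the committer's cheap shortcut and the solver's forced sequential work is exactly the decryption cost whose allocation the two-sided market resolves.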
Current metadata creation for web archives is time-consuming and costly due to reliance on human effort. This paper explores the use of gpt-4o for metadata generation within the Web Archive Singapore, focusing on scalability, efficiency, and cost-effectiveness. We processed 112 Web ARChive (WARC) files using data reduction techniques, achieving a notable 99.9% reduction in metadata generation costs. Through prompt engineering, we generated titles and abstracts, which were evaluated both intrinsically, using Levenshtein Distance and BERTScore, and extrinsically, with human cataloguers using McNemar's test. Results indicate that while our method offers significant cost savings and efficiency gains, human-curated metadata maintains an edge in quality. The study identifies key challenges, including content inaccuracies, hallucinations, and translation issues, suggesting that Large Language Models (LLMs) should serve as complements rather than replacements for human cataloguers. Future work will focus on refining prompts, improving content filtering, and addressing privacy concerns through experimentation with smaller models. This research advances the integration of LLMs in web archiving, offering valuable insights into their current capabilities and outlining directions for future enhancements. The code is available at https://github.com/masamune-prog/warc2summary for further development and use by institutions facing similar challenges.
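Levenshtein Distance, one of the intrinsic metrics named above, is the classic dynamic-programming edit distance. A minimal sketch for comparing a generated title against its human-curated counterpart (the example strings are illustrative, not from the study's data):

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit-cost insert, delete, and substitute,
    computed row by row to keep memory at O(len(b))."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,                 # delete ca
                cur[j - 1] + 1,              # insert cb
                prev[j - 1] + (ca != cb),    # substitute (free if equal)
            ))
        prev = cur
    return prev[-1]

print(levenshtein("kitten", "sitting"))  # → 3
```

A lower distance between the generated and cataloguer-written title indicates closer surface agreement; BERTScore complements this by measuring semantic rather than character-level similarity.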
Existing auto-regressive language models have demonstrated a remarkable capability to perform a new task from just a few examples in the prompt, without requiring any additional training. To extend this capability to a multi-modal setting (i.e., speech and language), this paper introduces the Seal model, an abbreviation for speech language model. It incorporates a novel alignment method in which a Kullback-Leibler divergence loss is used to train a projector that bridges a frozen speech encoder with a frozen language model decoder. The resulting Seal model exhibits robust performance as a few-shot learner on two speech understanding tasks. Additionally, consistency experiments validate its robustness across different pre-trained language models.
Artificial intelligence in critical sectors (healthcare, finance, and public safety) has made system integrity paramount for maintaining societal trust. Current verification methods for AI systems lack comprehensive lifecycle assurance, creating significant vulnerabilities in the deployment of AI that is both powerful and trustworthy. This research introduces Meta-Sealing, a cryptographic framework that fundamentally changes integrity verification in AI systems throughout their operational lifetime. Meta-Sealing surpasses traditional integrity protocols through its implementation of cryptographic seal chains, establishing verifiable, immutable records for all system decisions and transformations. The framework combines advanced cryptography with distributed verification, delivering tamper-evident guarantees that achieve both mathematical rigor and computational efficiency. Our implementation addresses urgent regulatory requirements for AI system transparency and auditability. The framework integrates with current AI governance standards, specifically the EU's AI Act and the FDA's healthcare AI guidelines, enabling organizations to maintain operational efficiency while meeting compliance requirements. Testing on financial institution data demonstrated Meta-Sealing's capability to reduce audit timeframes by 62% while enhancing stakeholder confidence by 47%. These results can establish a new benchmark for integrity assurance in enterprise AI deployments. This research presents Meta-Sealing not merely as a technical solution but as a foundational framework ensuring AI system integrity aligns with human values and regulatory requirements. As AI continues to influence critical decisions, Meta-Sealing provides the necessary bridge between technological advancement and verifiable trust, ensuring that the AI systems we depend on are as reliable and transparent as they are powerful.
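The seal-chain idea, each record cryptographically committing to its predecessor so that history becomes tamper-evident, can be sketched as a plain hash chain. This is an illustrative toy under assumed names, not the Meta-Sealing protocol itself (which adds distributed verification on top): altering any record invalidates every later seal.

```python
import hashlib
import json

GENESIS = "0" * 64  # fixed starting value for the chain

def seal(prev_hash: str, record: dict) -> dict:
    """Create a seal committing to both the record and all prior seals."""
    payload = json.dumps(record, sort_keys=True)
    digest = hashlib.sha256((prev_hash + payload).encode()).hexdigest()
    return {"record": record, "prev": prev_hash, "hash": digest}

def verify(chain) -> bool:
    """Recompute every digest; any tampering breaks the chain."""
    prev = GENESIS
    for link in chain:
        payload = json.dumps(link["record"], sort_keys=True)
        expected = hashlib.sha256((prev + payload).encode()).hexdigest()
        if link["prev"] != prev or link["hash"] != expected:
            return False
        prev = link["hash"]
    return True

chain, prev = [], GENESIS
for decision in ({"step": 1, "action": "load model"},
                 {"step": 2, "action": "approve request"}):
    link = seal(prev, decision)
    chain.append(link)
    prev = link["hash"]

assert verify(chain)
chain[0]["record"]["action"] = "deny request"  # tamper with history
assert not verify(chain)
```

Because each digest folds in the previous one, an auditor can detect a modified decision anywhere in the log by recomputing forward from the genesis value.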
Paper is considered a material element of culture and should be studied in its materiality and historicity. This research sought to identify the manufacturers of the papers used by the public administration in the captaincy of Minas Gerais, tracing the provenance of the paper through material and historical analyses in an interdisciplinary perspective, with emphasis on Italian papers. The characterization of the documentation aims to contribute to the study of the archive as a material object of culture.
Keywords: rag paper; watermarks; Casa dos Contos collection; 18th century.