William Silva, Clarissa Moreira dos Santos Schmidt
This article seeks to identify the information regimes at work in the drafting and implementation of the Digital Government Strategies (Estratégias do Governo Digital, EGD) that account for the absence of digital preservation in them. To that end, we adopt a mixed qualitative-quantitative approach in an applied study with an exploratory aim. The EGD contain no elements of the digital preservation guidelines issued by the Arquivo Nacional and the Conselho Nacional de Arquivos.
Keywords: information regime; digital preservation; digital government strategy; archival science.
Diplomatics. Archives. Seals, Bibliography. Library science. Information resources
Recognizing the instability of citation analysis as the pillar of the classic metric paradigm, this theoretical-methodological essay proposes the PLACECC methodological framework, an expansion and operationalization of John Ziman's diagnosis of post-academic science. This Zimanian metrology goes beyond the purely evaluative logic of indicators to diagnose contemporary scientific culture. The proposal adds two new dimensions, competitive and communal, to the original PLACE acronym and associates the seven resulting dimensions with bibliometric and altmetric indicators. The framework thereby recasts metrics as symptoms of the central tension in science today: the contest between market forces and the values of Open Science. We conclude that PLACECC offers a robust matrix for cultural diagnosis, capable of empirically mapping the disputes that define real science and of shifting the focus from mere counting to sociological understanding.
Diplomatics. Archives. Seals, Bibliography. Library science. Information resources
Retrieval-Augmented Generation (RAG) systems introduce a critical vulnerability: contextual leakage, where adversaries exploit instruction-following to exfiltrate Personally Identifiable Information (PII) via adaptive extraction. Current defenses force a rigid trade-off between semantic utility and latency. We present SEAL-Tag, a privacy-preserving runtime environment that resolves this via a Verify-then-Route paradigm. SEAL-Tag introduces the SEAL-Probe protocol, transforming auditing into a structured tool-use operation where the model generates a verifiable PII-Evidence Table (PET) alongside its draft. To adjudicate this evidence, we employ a Probabilistic Circuit (PC) that enforces verifiable logical constraints for robust decision-making. To overcome the privacy "Cold Start" problem, we introduce the S0--S6 Anchored Synthesis Pipeline, generating high-fidelity, provenanced RAG interactions. We pair this with a Two-Stage Curriculum that first optimizes for entity detection before aligning the model to the rigorous audit protocol. Our evaluation demonstrates that SEAL-Tag establishes a new Pareto frontier, reducing adaptive leakage by over 8$\times$ while matching the utility and speed of unsafe baselines.
Luiza Gutheil Bayer, Heloísa Helena Fernandes Gonçalves da Costa
This research sought to understand how the Museu Fernando Ferrari can serve as a vehicle for disseminating the history of São Pedro do Sul (RS) and fulfill its social function of strengthening local self-esteem and identity. The reorganization of the museum (2018-2019) supported sociocultural activities that yielded results for the dissemination of local history.
Keywords: museum; cultural heritage; identity; Museu Histórico Fernando Ferrari.
Diplomatics. Archives. Seals, Bibliography. Library science. Information resources
This study aims to establish “Information Literacy in Professional and Technological Education” as a research topic that has emerged in recent years. To that end, it presents an overview of Professional and Technological Education in Brazil since 2004 and compares, through the literature, the relationships among Information Literacy, Work, and Professional and Technological Education in international scholarship. To achieve this objective, it conducts exploratory research with a qualitative-quantitative approach, collecting data through bibliographic searches in databases such as Oasisbr, BDTD, Brapci, and Google Scholar. The 26 documents retrieved, together with their distribution over time and their variety of type and content, point to the sustained attention given in Brazil since 2015 to the need to understand how education for Information Literacy should take place in the context of Professional and Technological Education.
Diplomatics. Archives. Seals, Bibliography. Library science. Information resources
This paper introduces AI Blob!, an experimental system designed to explore the potential of semantic cataloging and Large Language Models (LLMs) for the retrieval and recontextualization of archival television footage. Drawing methodological inspiration from Italian television programs such as Blob (RAI Tre, 1989-), AI Blob! integrates automatic speech recognition (ASR), semantic embeddings, and retrieval-augmented generation (RAG) to organize and reinterpret archival content. The system processes a curated dataset of 1,547 Italian television videos by transcribing audio, segmenting it into sentence-level units, and embedding these segments into a vector database for semantic querying. Upon user input of a thematic prompt, the LLM generates a range of linguistically and conceptually related queries, guiding the retrieval and recombination of audiovisual fragments. These fragments are algorithmically selected and structured into narrative sequences producing montages that emulate editorial practices of ironic juxtaposition and thematic coherence. By foregrounding dynamic, content-aware retrieval over static metadata schemas, AI Blob! demonstrates how semantic technologies can facilitate new approaches to archival engagement, enabling novel forms of automated narrative construction and cultural analysis. The project contributes to ongoing debates in media historiography and AI-driven archival research, offering both a conceptual framework and a publicly available dataset to support further interdisciplinary experimentation.
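The retrieval flow described above (transcribe, segment into sentences, embed, query with several related prompts) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `embed` function is a toy bag-of-words vector standing in for a neural sentence embedding, the in-memory list stands in for a vector database, the transcripts are invented, and the list of queries is written by hand rather than generated by an LLM. It is not the AI Blob! implementation.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: a bag-of-words count vector (stand-in for a neural model)."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(transcripts):
    """Split each transcript into sentences and store (video_id, sentence, vector)."""
    index = []
    for video_id, text in transcripts.items():
        for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
            if sentence:
                index.append((video_id, sentence, embed(sentence)))
    return index


def search(index, queries, top_k=2):
    """Score each segment by its best similarity to any query; keep the top_k."""
    query_vecs = [embed(q) for q in queries]
    scored = [
        (max(cosine(qv, vec) for qv in query_vecs), vid, sent)
        for vid, sent, vec in index
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(vid, sent) for score, vid, sent in scored[:top_k] if score > 0]


# Hypothetical transcripts; in the described system these would come from ASR.
transcripts = {
    "video_001": "The election results were announced tonight. "
                 "The weather will be sunny tomorrow.",
    "video_002": "Voters queued for hours at the election polls. "
                 "The football match ended in a draw.",
}
index = build_index(transcripts)
# Hand-written related queries, standing in for LLM-generated query expansion.
hits = search(index, ["election results", "voters at the polls"])
```

Scoring each segment against the best-matching query, rather than against a single prompt, is one simple way to let several related queries pull in fragments that a literal match on the original theme would miss.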
Retrieval-Augmented Generation (RAG) has emerged as a promising approach for knowledge-intensive tasks. However, few studies have examined RAG for Taiwanese Historical Archives. In this paper, we present an initial study of a RAG pipeline applied to two historical Traditional Chinese datasets, Fort Zeelandia and the Taiwan Provincial Council Gazette, along with their corresponding open-ended query sets. We systematically investigate the effects of query characteristics and metadata integration strategies on retrieval quality, answer generation, and the performance of the overall system. The results show that early-stage metadata integration enhances both retrieval and answer accuracy while also revealing persistent challenges for RAG systems, including hallucinations during generation and difficulties in handling temporal or multi-hop historical queries.
Philippe Colantoni, Rafique Ahmed, Prashant Ghimire et al.
The accuracy and efficiency of human body pose estimation depend on the quality and the particularities of the data to be processed. To demonstrate how dance videos can challenge pose estimation techniques, we first proposed a new 3D human body pose estimation pipeline that combined up-to-date techniques and methods that had not yet been used in dance analysis. Second, we performed extensive tests and experiments on dance video archives and used visual analytics tools to evaluate the impact of several data parameters on human body pose estimation. Our results are publicly available for research at https://www.couleur.org/articles/arXiv-1-2025/
Amirhossein Fardi, Hamayun Farooq, Imran Akhtar et al.
In this paper, we investigate the hydrodynamic characteristics of harbor seal locomotion, focusing on the role of hind flippers in thrust generation and wake dynamics. Through three-dimensional numerical simulations using an immersed boundary method at a Reynolds number of 3000, we analyze the impact of varying Strouhal number ($St = 0.2$--$0.35$) and propulsive wavelength ($λ^\ast = 1.0$--$1.2$) on swimming performance. Our findings reveal two distinct wake patterns: a single-row structure at lower Strouhal numbers ($St \leq 0.25$) and a double-row configuration at higher Strouhal numbers ($St \geq 0.3$). Increasing the wavelength generally enhances thrust production by reducing both the pressure and friction components of drag. Additionally, we identify critical vortex interactions between the front and hind flippers, with destructive interference occurring at lower $St$ and constructive patterns emerging at higher $St$. Circulation analysis confirms stronger vortex formation at higher $St$ and $λ^\ast$, particularly during the left stroke phase. These results provide novel insights into the hydrodynamic mechanisms underlying seal locomotion and contribute to our understanding of efficient aquatic propulsion systems.
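For readers unfamiliar with the two dimensionless groups, the sketch below shows how they are commonly computed and how the reported thresholds partition the wake regimes. The definitions $St = fA/U$ and $Re = UL/\nu$ are the conventional ones and are assumed here, since the abstract does not state them; the function names and sample values are hypothetical, and the range between 0.25 and 0.3 is labeled as not reported because the abstract gives no result for it.

```python
def strouhal(frequency_hz, amplitude_m, speed_m_s):
    """St = f * A / U (conventional definition, assumed here)."""
    return frequency_hz * amplitude_m / speed_m_s


def reynolds(speed_m_s, length_m, kinematic_viscosity_m2_s):
    """Re = U * L / nu (conventional definition, assumed here)."""
    return speed_m_s * length_m / kinematic_viscosity_m2_s


def wake_pattern(st):
    """Map a Strouhal number to the wake regime reported in the abstract."""
    if st <= 0.25:
        return "single-row"
    if st >= 0.3:
        return "double-row"
    # The abstract reports nothing for 0.25 < St < 0.3.
    return "not reported"


# Hypothetical stroke parameters chosen only to land in each reported regime.
low_st = strouhal(frequency_hz=1.0, amplitude_m=0.2, speed_m_s=1.0)
high_st = strouhal(frequency_hz=1.75, amplitude_m=0.2, speed_m_s=1.0)
regimes = (wake_pattern(low_st), wake_pattern(high_st))
```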
This article proposes an interpretation of the monument that marked the inauguration of the first completed stretch of the Transamazônica highway. It examines the relations between the dictatorship and business leaders, and the dictatorship's plans for the Amazon, showing that this relationship took the form of a symbolic economy based on gift and counter-gift. The monument, understood as a semiophore, makes certain memories of capitalism explicit.
Keywords: Brazilian civil-military dictatorship; memory; Transamazônica; memory of capitalism.
Diplomatics. Archives. Seals, Bibliography. Library science. Information resources
Brenner Lopes, Ricardo Rodrigues Barbosa, Luander Cipriano de Jesus Falcão et al.
We live in an environment characterized as an ocean of data, which is growing not only in volume but also in variety, and which is created and moved at high speed. Structured data now account for a much smaller share, in both quantity and importance, and technologies and analytical models have been adjusted and improved in part to adapt to this new reality, which has come to be called Big Data Analytics. Threats to privacy are among the major concerns in this new reality. Various studies conclude that the procedures, techniques, technologies, and legislation currently available cannot fully guarantee privacy. Given this complex scenario, the objective of this research was to propose a multifaceted theoretical model for Big Data Analytics that guarantees privacy without preventing the extraction of value. The methodology was a systematic literature review, aimed at critically analyzing the observations and conclusions of earlier studies and at identifying and logically proposing new hypotheses and constructs, so as to shape the final design of a theoretical model. The result is the Privacy Pentagon in Big Data Analytics (Pentágono da Privacidade no Big Data Analytics), which comprises a kaleidoscope of solutions capable of guaranteeing privacy while also safeguarding the extraction of value in Big Data Analytics. The resulting construct offers a concise and consistent answer to the study's research question.
Diplomatics. Archives. Seals, Bibliography. Library science. Information resources
The phase-out of hydrofluorocarbons, owing to their high Global Warming Potential, affects the main gas used in Resistive Plate Chambers (RPCs), tetrafluoroethane C$_2$H$_2$F$_4$. It has increased operational difficulties on existing systems and imposes strong restrictions on the use of this gas in new systems. This has motivated a new line of R\&D on sealed RPCs: RPCs that do not require a continuous gas flow for their operation and dispense with the use of very complex and expensive gas re-circulation and/or recycling systems. At the moment it is not clear whether this solution can cover all fields of application normally allocated to RPCs, but it seems that it could be considered a valid option for low particle flux triggering/tracking of particles, e.g. in cosmic ray or rare event experiments. In this work, we demonstrate the feasibility of a small telescope for atmospheric muon tracking consisting of four $300 \times 300$~mm$^2$ sealed RPCs with gas gap widths of $1$~mm, $1.5$~mm and $2$~mm. The results suggest that it is possible to operate this type of detector for extended periods of time (more than five months) with its main characteristics (efficiency, average charge and streamer probability) showing no apparent degradation and remaining similar to those of an RPC operated in continuous gas flow.
Max Brodheim, John O'Meara, Jeffrey A. Mader et al.
The W. M. Keck Observatory is welcoming a new era where data reduction and archiving are tightly integrated into our observing model, under the auspices of the Observatory's Data Services Initiative (DSI) project. While previously the Keck Observatory Archive (KOA) archived minimally processed, raw science data the day after observing, Keck is transitioning to a model in which it archives both raw frames and reduced data in near real-time. These data will be made available to observers and collaborators immediately upon ingestion through a dedicated new interface that will support collaboration and sharing among teams, as well as stream data directly to personal computers without access to WMKO's internal networks. Both the raw and science-ready data products will be made publicly available upon the expiration of data protections.
Pierre Fernandez, Hady Elsahar, I. Zeki Yalniz et al.
The proliferation of AI-generated content and sophisticated video editing tools has made it both important and challenging to moderate digital platforms. Video watermarking addresses these challenges by embedding imperceptible signals into videos, allowing for identification. However, the few open tools and methods available often fall short on efficiency, robustness, and flexibility. To close these gaps, this paper introduces Video Seal, a comprehensive framework for neural video watermarking and a competitive open-sourced model. Our approach jointly trains an embedder and an extractor, while ensuring watermark robustness by applying transformations in between, e.g., video codecs. This training is multistage and includes image pre-training, hybrid post-training and extractor fine-tuning. We also introduce temporal watermark propagation, a technique to convert any image watermarking model to an efficient video watermarking model without the need to watermark every high-resolution frame. We present experimental results demonstrating the effectiveness of the approach in terms of speed, imperceptibility, and robustness. Video Seal achieves higher robustness compared to strong baselines, especially under challenging distortions combining geometric transformations and video compression. Additionally, we provide new insights such as the impact of video compression during training, and how to compare methods operating on different payloads. Contributions in this work - including the codebase, models, and a public demo - are open-sourced under permissive licenses to foster further research and development in the field.
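One way to read temporal watermark propagation is sketched below: run the image embedder only on every `step`-th frame and reuse the resulting watermark residual (watermarked frame minus original) on the frames in between. This reading, the `embed_frame` stand-in, the `step` parameter, and the flat-list frame representation are all assumptions made for illustration; the abstract does not specify the scheme, and this is not the Video Seal codebase.

```python
def embed_frame(frame, strength=1.0):
    """Hypothetical image watermark embedder: adds a toy alternating pattern."""
    return [p + strength * (1 if i % 2 == 0 else -1) for i, p in enumerate(frame)]


def watermark_video(frames, step=4):
    """Watermark every `step`-th frame and propagate its residual to the rest.

    Returns the watermarked frames and the number of embedder calls made.
    """
    watermarked = []
    residual = None
    embedder_calls = 0
    for t, frame in enumerate(frames):
        if t % step == 0:
            # Key frame: run the (expensive) embedder and cache its residual.
            marked = embed_frame(frame)
            embedder_calls += 1
            residual = [m - p for m, p in zip(marked, frame)]
        else:
            # In-between frame: reuse the cached residual, no embedder call.
            marked = [p + r for p, r in zip(frame, residual)]
        watermarked.append(marked)
    return watermarked, embedder_calls


# Eight toy "frames" of four pixels each; frame t has constant intensity t.
frames = [[float(t)] * 4 for t in range(8)]
marked_frames, calls = watermark_video(frames, step=4)
```

In this sketch the embedder runs twice for eight frames, which shows where the efficiency gain would come from; how well a propagated residual survives motion between frames is outside what the sketch can show.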
Jan Heinrich Reimer, Sebastian Schmidt, Maik Fröbe et al.
The Archive Query Log (AQL) is a previously unused, comprehensive query log collected at the Internet Archive over the last 25 years. Its first version includes 356 million queries, 166 million search result pages, and 1.7 billion search results across 550 search providers. Although many query logs have been studied in the literature, the search providers that own them generally do not publish their logs to protect user privacy and vital business data. Of the few query logs publicly available, none combines size, scope, and diversity. The AQL is the first to do so, enabling research on new retrieval models and (diachronic) search engine analyses. Provided in a privacy-preserving manner, it promotes open research as well as more transparency and accountability in the search industry.
Cameron Kisailus, Daksh Narang, Matthew Shannon et al.
Recent advances in robotic mobile manipulation have spurred the expansion of the operating environment for robots from constrained workspaces to large-scale, human environments. In order to effectively complete tasks in these spaces, robots must be able to perceive, reason, and execute over a diversity of affordances, well beyond simple pick-and-place. We posit that the notion of semantic frames provides a compelling representation for robot actions that is amenable to action-focused perception, task-level reasoning, action-level execution, and integration with language. Semantic frames, a product of the linguistics community, define the necessary elements, pre- and post-conditions, and a set of sequential robot actions necessary to successfully execute an action evoked by a verb phrase. In this work, we extend the semantic frame representation for robot manipulation actions and introduce the problem of Semantic Frame Execution And Localization for Perceiving Afforded Robot Actions (SEAL) as a graphical model. For the SEAL problem, we describe our nonparametric Semantic Frame Mapping (SeFM) algorithm for maintaining belief over a finite set of semantic frames as the locations of actions afforded to the robot. We show that language models such as GPT-3 are insufficient to address generalized task execution covered by the SEAL formulation and that SeFM provides robots with the efficient search strategies and long-term memory needed when operating in building-scale environments.