<p><span id="page62"/>Chew Bahir, a lake that is dry for most of the year, located in a tectonic basin in the southern Ethiopian Rift, was the target of several scientific drilling expeditions between 2009 and 2014. The aim of these expeditions was to explore the basin and its lake sediments as an archive of past changes in the environmental conditions during the evolution of our species, <i>Homo sapiens</i>. In more than 25 publications, the scientific findings derived from the analysis of the sediments were presented and discussed in detail. In the present paper, we provide the background information on the project's origins, planning and implementation – that is, information that has not yet been presented in scientific papers, or only very briefly, but which could be important for those working on similar projects in the future. Herein, we particularly focus on the advantages and disadvantages of obtaining twin cores at a short distance, aiming at a continuous high-quality composite core, a strategy that had to be defended during the planning stage of the project due to the higher costs involved but which is considered to be the best practice for scientific drilling in modern sedimentary basins.</p>
Paleoradiology, the use of modern imaging technologies to study archaeological and anthropological remains, offers new windows on millennial-scale patterns of human health. Unfortunately, the radiographs collected during field campaigns are heterogeneous: bones are disarticulated, positioning is ad hoc, and laterality markers are often absent. Factors such as age at death, age of the bone, sex, and imaging equipment introduce further variability. Content navigation, such as identifying the subset of images with a specific projection view, can therefore be time-consuming and difficult, making efficient triage a bottleneck for expert analysis. We report a zero-shot prompting strategy that leverages a state-of-the-art Large Vision-Language Model (LVLM) to automatically identify the main bone, projection view, and laterality in such images. Our pipeline converts raw DICOM files to bone-windowed PNGs, submits them to the LVLM with a carefully engineered prompt, and receives structured JSON outputs, which are extracted and formatted into a spreadsheet in preparation for validation. On a random sample of 100 images reviewed by an expert board-certified paleoradiologist, the system achieved 92% main-bone accuracy, 80% projection-view accuracy, and 100% laterality accuracy, with low- or medium-confidence flags for ambiguous cases. These results suggest that LVLMs can substantially accelerate code-word development for large paleoradiology datasets, enabling efficient content navigation in future anthropology workflows.
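The extraction-and-formatting step of such a pipeline can be sketched in a few lines. The JSON field names below ("main_bone", "projection_view", "laterality", "confidence") are illustrative assumptions, not the paper's actual schema; the sketch only shows how structured LVLM outputs might be flattened into spreadsheet rows, with parse failures flagged for expert review.

```python
import csv
import io
import json

FIELDS = ["image_id", "main_bone", "projection_view", "laterality", "confidence"]

def json_to_rows(raw_outputs):
    """Convert (image_id, raw JSON string) pairs into dict rows.

    Unparseable model outputs are flagged rather than dropped, so the
    validating expert can see exactly which images need a second pass."""
    rows = []
    for image_id, raw in raw_outputs:
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            data = {}
        rows.append({
            "image_id": image_id,
            "main_bone": data.get("main_bone", "PARSE_ERROR"),
            "projection_view": data.get("projection_view", "PARSE_ERROR"),
            "laterality": data.get("laterality", "PARSE_ERROR"),
            "confidence": data.get("confidence", "low"),
        })
    return rows

def rows_to_csv(rows):
    """Serialize rows to CSV text ready to open as a validation spreadsheet."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()
```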
Ancient manuscripts are the primary source of ancient linguistic corpora. However, many ancient manuscripts exhibit duplications, whether from unintentional repeated publication or deliberate forgery. The Dead Sea Scrolls, for example, include counterfeit fragments, whereas Oracle Bones (OB) contain both republished materials and fabricated specimens. Identifying duplicate ancient manuscripts is of great significance for both archaeological curation and the study of ancient history. In this work, we design a progressive OB duplicate discovery framework that combines unsupervised low-level keypoint matching with high-level, text-centric, content-based matching to refine and rank candidate OB duplicates with semantic awareness and interpretability. We compare our model with state-of-the-art content-based image retrieval and image matching methods, showing that it achieves comparable recall, the highest simplified mean reciprocal rank scores for both Top-5 and Top-15 retrieval results, and significantly higher computational efficiency. In real-world deployment, we have discovered over 60 pairs of new OB duplicates that had been missed by domain experts for decades. Code, model and real-world results are available at: https://github.com/cszhangLMU/OBD-Finder/.
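The abstract's "simplified mean reciprocal rank" metric is not defined here; a common reading, assumed in this sketch, is the standard MRR truncated at rank k: each query scores 1/rank of its first true duplicate among the top-k retrieved candidates, and 0 if none appears.

```python
def simplified_mrr(rankings, duplicates, k):
    """Mean over queries of 1/rank of the first true duplicate within the
    top-k retrieved candidates (0 if no duplicate appears in the top k).

    rankings:   {query_id: [candidate ids, best first]}
    duplicates: {query_id: set of true duplicate ids}
    """
    total = 0.0
    for query, candidates in rankings.items():
        for rank, cand in enumerate(candidates[:k], start=1):
            if cand in duplicates.get(query, set()):
                total += 1.0 / rank
                break  # only the first hit counts
    return total / len(rankings)
```

Under this reading, Top-5 and Top-15 scores differ only when a query's first true duplicate is ranked between positions 6 and 15.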
Deciphering oracle bone characters (OBCs), the oldest attested form of written Chinese, has long been a central goal of scholarship, offering an irreplaceable key to understanding humanity's early modes of production. Current OBC decipherment is constrained primarily by the sporadic nature of archaeological excavations and the limited corpus of inscriptions. The powerful visual perception capabilities of large multimodal models (LMMs) raise the possibility of deciphering OBCs visually. In this paper, we introduce PictOBI-20k, a dataset designed to evaluate LMMs on visual decipherment tasks for pictographic OBCs. It comprises 20k meticulously collected images of OBCs and real objects, forming over 15k multiple-choice questions. We also conduct subjective annotations to investigate the consistency of reference points between humans and LMMs in visual reasoning. Experiments indicate that general LMMs possess preliminary visual decipherment skills but do not use visual information effectively, being limited most of the time by language priors. We hope our dataset can facilitate the evaluation and optimization of visual attention in future OBC-oriented LMMs. The code and dataset will be available at https://github.com/OBI-Future/PictOBI-20k.
We study recovering a 1D order from a noisy, locally sampled pairwise comparison matrix under a tight query budget. We recast the task as reconstructing a sparse, noisy line graph and present, to our knowledge, the first method that provably builds a sparse graph containing all edges needed for exact seriation using only O(N(log N + K)) oracle queries, which is near-linear in N for fixed window K. The approach is parallelizable and supports both binary and bounded-noise distance oracles. Our five-stage pipeline consists of: (i) a random-hook Boruvka step to connect components via short-range edges in O(N log N) queries; (ii) iterative condensation to bound graph diameter; (iii) a double-sweep BFS to obtain a provisional global order; (iv) fixed-window densification around that order; and (v) a greedy SuperChain that assembles the final permutation. Under a simple top-1 margin condition and bounded relative noise, we prove exact recovery; empirically, SuperChain still succeeds when only about 2N/3 of the true adjacencies are present. On wafer-scale serial-section EM, our method outperforms spectral, MST, and TSP baselines with far fewer comparisons, and it is applicable to other locally structured sequencing tasks such as temporal snapshot ordering, archaeological seriation, and playlist/tour construction.
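Stage (iii) can be illustrated on its own. In a double-sweep BFS, one sweep from an arbitrary vertex finds an (approximately) extremal vertex of the underlying line, and a second sweep from that vertex yields distances that serve as a provisional global order. This is a generic sketch of the standard double-sweep idea, not the paper's exact implementation:

```python
from collections import deque

def bfs(adj, start):
    """Breadth-first search; returns {node: hop distance from start}."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def double_sweep_order(adj):
    """Provisional 1D order of a connected, roughly path-like graph:
    sweep 1 locates an extremal vertex, sweep 2 orders all vertices
    by distance from it."""
    start = next(iter(adj))
    d1 = bfs(adj, start)
    end = max(d1, key=d1.get)  # (approximate) end of the line
    d2 = bfs(adj, end)
    return sorted(d2, key=d2.get)
```

On an exact path graph the result is the true order (up to reversal); on a noisy near-line graph it is only provisional, which is why the pipeline follows it with densification and the SuperChain assembly.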
Constructing historical language models (LMs) plays a crucial role in aiding archaeological provenance studies and understanding ancient cultures. However, existing resources present major challenges for training effective LMs on historical texts. First, the scarcity of historical language samples renders unsupervised learning approaches based on large text corpora highly inefficient, hindering effective pre-training. Moreover, due to the considerable temporal gap and complex evolution of ancient scripts, the absence of comprehensive character encoding schemes limits the digitization and computational processing of ancient texts, particularly in early Chinese writing. To address these challenges, we introduce InteChar, a unified and extensible character list that integrates unencoded oracle bone characters with traditional and modern Chinese. InteChar enables consistent digitization and representation of historical texts, providing a foundation for robust modeling of ancient scripts. To evaluate the effectiveness of InteChar, we construct the Oracle Corpus Set (OracleCS), an ancient Chinese corpus that combines expert-annotated samples with LLM-assisted data augmentation, centered on Chinese oracle bone inscriptions. Extensive experiments show that models trained with InteChar on OracleCS achieve substantial improvements across various historical language understanding tasks, confirming the effectiveness of our approach and establishing a solid foundation for future research in ancient Chinese NLP.
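The abstract does not detail InteChar's encoding mechanism, but the general problem it addresses, giving unencoded oracle bone characters stable codepoints alongside encoded traditional and modern Chinese, can be sketched with Unicode's Private Use Area. Everything below (the glyph-id format, the choice of plane) is a hypothetical illustration, not InteChar's actual design:

```python
PUA_START = 0xF0000  # Supplementary Private Use Area-A

def build_char_table(unencoded_glyph_ids):
    """Assign each unencoded glyph id (e.g., a catalogue number) a stable
    Private Use Area character, so texts mixing encoded and unencoded
    characters share one tokenizable string space."""
    return {gid: chr(PUA_START + i)
            for i, gid in enumerate(sorted(unencoded_glyph_ids))}

def encode_text(tokens, table):
    """Render a token sequence (a mix of ordinary characters and glyph ids)
    as a single string; ordinary characters pass through unchanged."""
    return "".join(table.get(tok, tok) for tok in tokens)
```

Sorting the glyph ids before assignment keeps the mapping deterministic across runs, which matters if the same table must be shared between corpus builders and model tokenizers.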
Hipólito Sanchíz-Álvarez-de-Toledo, Covadonga Lorenzo-Cueva, María Fernández-Portaencasa
This article presents the epigraphic study, dating and digital representation of a Roman stela from Retuerta del Bullaque, in Ciudad Real, contributing relevant information derived from its study and documentation through photogrammetry and three-dimensional scanning. Although the inscription is not unpublished, the authors of its original publication relied on an anonymous report and had been unable to see or read the stela, nor did they know its provenance or current whereabouts. In addition to providing a valuable study of the stela that includes the description, dating and translation of the inscription, the find has been illustrated with a three-dimensional image obtained using digital technologies, which can help improve its interpretation or reading compared with rubbings, which may be less precise, or with photographs of insufficient quality for study.
Volume 14, Issue 2, 2024

I am delighted and honoured to present this issue of the Iranian Journal of Archaeological Studies to you, our esteemed readers. In this note, I will introduce the valuable achievements and diverse articles of this issue.
Discovering valuable insights from data through meaningful associations is a crucial task. It becomes challenging, however, when trying to identify representative patterns in quantitative databases, especially large ones, because enumeration-based strategies struggle with the vast search space involved. To tackle this challenge, output-space sampling methods have emerged as a promising solution, thanks to their ability to discover valuable patterns with reduced computational overhead. However, existing sampling methods often encounter limitations when dealing with large quantitative databases, resulting in scalability challenges. In this work, we propose a novel high-utility pattern sampling algorithm, together with an on-disk version, both designed for large quantitative databases and based on two original theorems. Our approach ensures both the interactivity required for user-centered methods and strong statistical guarantees through random sampling. With our method, users can instantly discover relevant and representative utility patterns, facilitating efficient exploration of the database within seconds. To demonstrate the interest of our approach, we present a compelling use case involving the discovery of archaeological knowledge graph sub-profiles. Experiments on semantic and non-semantic quantitative databases show that our approach outperforms state-of-the-art methods.
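The core statistical guarantee of output-space utility sampling, namely that each pattern is drawn with probability proportional to its utility, can be illustrated independently of the paper's algorithm (whose two theorems and on-disk variant are not reproduced here):

```python
import random

def sample_by_utility(patterns, utilities, n, seed=None):
    """Draw n patterns i.i.d. with probability proportional to utility.

    A naive illustration of the sampling guarantee over an already
    materialized pattern list; the paper's contribution is achieving this
    guarantee efficiently on large quantitative databases without
    enumerating the output space."""
    rng = random.Random(seed)
    return rng.choices(patterns, weights=utilities, k=n)
```

Because the draw is i.i.d. and weight-proportional, high-utility patterns dominate the sample while every pattern with nonzero utility retains a chance of appearing, which is what gives such samples their representativeness guarantees.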
Cities emerged independently across different world regions and historical periods, raising fundamental questions: How did the first urban settlements develop? What social and spatial conditions enabled their emergence? Are these processes universal or context-dependent? Moreover, what distinguishes cities from other human settlements? This paper investigates the drivers behind the creation of cities through a hybrid approach that integrates urban theory, the biological concept of morphospace (the space of all possible configurations), and archaeological evidence. It explores the transition from sedentary hunter-gatherer communities to urban societies, highlighting fundamental forces converging to produce increasingly complex divisions of labour as a central driver of urbanization. Morphogenesis is conceptualized as a trajectory through morphospace, governed by structure-seeking selection processes that balance density, permeability, and information as critical dimensions. The study highlights the non-ergodic nature of urban morphogenesis, where configurations are progressively selected based on their fitness to support the diversifying interactions between mutually dependent agents. The morphospace framework effectively distinguishes between theoretical spatial configurations, non-urban and proto-urban settlements, and contemporary cities. This analysis supports the proposition that cities emerge and evolve as solutions balancing density, permeability, and informational organization, enabling them to support increasingly complex societal functions.
The lower reaches of the Minjiang River lie in an area of land-sea-air interaction. The region is not only sensitive to environmental change but also a hotspot for archaeological research on the southeast coast of China. Exploring the extent of ancient human activity and the evolution of land-use patterns is of great significance for understanding the development of human-land relations. By generating various cost surfaces for sites through GIS cost-distance analysis, combined with site catchment theory and methods, we comprehensively analyzed the range of human activity and the land-use pattern in each cultural period from the Neolithic to the Bronze Age in the lower reaches of the Minjiang River, and reconstructed the potential population of the region in each period. The results show that: (1) The areas of site catchment, accessible land and actually developed land all increased from the Keqiutou to the Huangtulun culture period. Chronologically, the site catchment area grew from 212 km² to 4,858 km², the accessible land area from 261 km² to 7,599 km², and the land area actually used by ancient humans from 173 km² to 3,914 km². (2) As the cultures developed, the land-use intensity of prehistoric humans in the region (81.58%, 92.95%, 87.99%, 79.33% and 80.57% for the successive periods) first increased and then decreased, and the degree of land development fell, which may be related to the development and progress of productive forces. The frequency of exchanges between ancient humans can be represented by the ratio of the number of sites within overlapping site catchments to the total number of sites of the same period (87.50%, 60.00%, 66.67%, 92.54%, and 97.81%, respectively).
(3) The reconstruction results show that the potential population of the area continued to expand, with substantial population growth in the lower Tanshishan-Tanshishan and Huangguashan-Huangtulun transitional periods. This suggests a relationship between population pressure and cultural succession.
Studies of the settlement of early medieval rural sites increasingly start from an analysis that draws on a repertoire of written and material sources of diverse nature, requiring interdisciplinary working environments. This necessary interdisciplinarity, however, calls for epistemological reflection on how to manage and process the information in order to avoid, as far as possible, generating parallel narratives depending on the sources used. The problem is aggravated in contexts where sources are markedly scarce and those that do exist reflect only a very partial and fragmented picture of the territory. Mountain areas are representative of this phenomenon, and among them the Alto Arlanza (Burgos). This territory, emblematic for its rock-cut necropolises, hosts a considerable number of exceptional sites for the study of the formation of social and organizational structures during Late Antiquity, and the excavations carried out at the site of Revenga since 2014 are a good example. Our study reflects on information-management processes and proposes some methodological strategies for handling the sources, integrating sources of diverse nature, and their possible (re)interpretation.
Nathanaëlle Courant, Julien Lepiller, Gabriel Scherer
Context: It is common for a programming language's reference implementation to be written in the language itself. This requires a "bootstrap": a copy of a previous version of the implementation is provided along with the sources, in order to run the implementation itself. Those bootstrap files are opaque binaries; they could contain bugs, or even malicious changes that reproduce themselves when the source version of the language implementation is built -- the "trusting trust attack". For this reason, a collective project called Bootstrappable was launched in 2016 to remove those bootstraps, providing alternative build paths that do not rely on opaque binaries. Inquiry: Debootstrapping generally combines two approaches. The "archaeological" approach locates old versions of systems, or legacy alternative implementations, that do not need the bootstrap, and preserves or restores the ability to run them. The "tailored" approach re-implements a new, non-bootstrapped implementation of the system to be debootstrapped. Currently, the "tailored" approach is dominant for low-level system components (C, coreutils), and the "archaeological" approach is dominant among the few higher-level languages that have been debootstrapped. Approach: We advocate the benefits of "tailored" debootstrapping implementations of high-level languages. The new implementation need not be production-ready; it suffices that it can run the reference implementation correctly. We argue that this is feasible with a reasonable development effort, and that it brings several side benefits besides debootstrapping. Knowledge: We propose a specific design of composed/stacked implementations: a reference interpreter for the language of interest, implemented in a small subset of the language, and a compiler for this small subset (in another language).
Developing a reference interpreter is valuable independently of debootstrapping: it may help clarify the language semantics, and it can be reused for other purposes such as differential testing of the other implementations. Grounding: We present Camlboot, our project to debootstrap the OCaml compiler, version 4.07. Once we converged on this final design, the last version of Camlboot took about a person-month of implementation effort, demonstrating feasibility. Using diverse double-compilation, we were able to prove the absence of a trusting trust attack in the existing bootstrap of the standard OCaml implementation. Importance: To our knowledge, this document is the first scholarly discussion of "tailored" debootstrapping for high-level programming languages. Debootstrapping is an interesting problem that has recently grown an active community of free-software contributors, but so far its interactions with the programming-language research community have been minimal. We share our experience with Camlboot, highlighting aspects of interest to other language designers and implementors; we hope to foster stronger ties between the Bootstrappable project and the relevant academic communities. In particular, the debootstrapping experience offered an interesting reflection on OCaml's design and implementation, and we hope other language implementors will find it equally valuable.
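The diverse double-compilation (DDC) check mentioned above has a simple shape. The sketch below models only the comparison logic, with a stand-in `compile_with` function in place of real compiler invocations (in practice, the check compares bit-for-bit binaries produced from the same sources by the suspect bootstrap and by an independent implementation such as Camlboot):

```python
import hashlib

def ddc_check(compile_with, source, suspect_binary, trusted_binary):
    """Diverse double-compilation: if the suspect bootstrap binary is clean,
    rebuilding the compiler sources twice through each of two independent
    compilers converges to bit-identical binaries.

    compile_with(binary, source) stands in for "run this compiler binary on
    these sources and return the resulting binary"."""
    # Stage 1: build the compiler sources with both the suspect bootstrap
    # binary and the independent (trusted) implementation.
    stage1_suspect = compile_with(suspect_binary, source)
    stage1_trusted = compile_with(trusted_binary, source)
    # Stage 2: rebuild the same sources with both stage-1 outputs. A clean
    # bootstrap gives identical stage-2 binaries; a self-reproducing
    # backdoor survives into stage 2 and breaks the equality.
    stage2_suspect = compile_with(stage1_suspect, source)
    stage2_trusted = compile_with(stage1_trusted, source)
    digest = lambda b: hashlib.sha256(b).hexdigest()
    return digest(stage2_suspect) == digest(stage2_trusted)
```

The two stages matter: a single cross-build is not enough, because the suspect and trusted compilers may legitimately emit different code, whereas after the second build both binaries were produced by compilers built from the same sources.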
This article describes the study of a Bronze Age limestone slab with cup marks, discovered during the archaeological excavation of kurgan 1 of the Prolom II kurgan grave field, located in the Belogorsk region of Crimea. The study concludes that the Belogorsk slab is a sundial of roughly the 15th-12th centuries BC belonging to the Srubnaya culture. By type, it is closest to the analemmatic sundial. However, the principle of its hour markings is so unusual that we propose to distinguish a new type: the inverted analemmatic sundial. Unlike a typical analemmatic sundial, the gnomon remains motionless throughout the year while, in accordance with the analemma, the "dial" – an ellipse of hour markers formed by cup marks – "moves"; that is, the gnomon and the hour markers exchange roles with respect to mobility. The movement of the hour markers on the Belogorsk slab is not literal but is imitated by several rows of cup marks, which are fragments of hour-marker ellipses for different months of the year. The idea behind this design is sufficiently novel that we may speak of the discovery of a completely new type of sundial, no analogue of which has yet been found. Keywords: cup marks, sundial, inverted, analemma, true solar time, mean solar time, hour markers, Srubnaya culture, Bronze Age, kurgan grave field, slab.
The use of conventional imaging techniques becomes problematic when faced with challenging logistics and confined environments. Such scenarios are not unusual in archaeological and mining exploration, as well as in nuclear waste characterization. For these applications, even muography is complicated, since the detectors have to be deployed in difficult areas with limited room for instrumentation, e.g., narrow tunnels. To address this limitation, we have developed a portable muon detector (muoscope) based on glass Resistive Plate Chambers (RPCs) with an active area of 16 $\times$ 16 cm$^{2}$. The design goals for our first prototype were portability, robustness, autonomy, versatility, safety and low cost. To further these goals, we are currently studying the possibility of switching the sensitive units from the strips of the old prototype to pixels in the new one. However, high-resolution muography requires a significantly larger number of readout units per layer, increasing the overall cost and power consumption of the muoscope. To mitigate these issues, we are developing a novel 2D multiplexing algorithm for reading out several pixels with a single electronic channel. In this article, we give an overview of the detector development, focusing mainly on the design goals and the choice of detector technology. Furthermore, we present the details of the expected changes in the new prototype, as well as a simulated 2D multiplexing study based on general principles.
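The channel-count pressure described above is why multiplexed readout matters. As a generic illustration (not the group's actual algorithm, which the abstract does not describe), the classic row-column scheme identifies a pixel by the coincidence of one row channel and one column channel, reducing an n × m pixel matrix from n·m channels to n + m:

```python
def channel_map(n_rows, n_cols):
    """Assign pixel (r, c) the channel pair (row channel r, column channel
    n_rows + c). Total channels: n_rows + n_cols instead of n_rows * n_cols."""
    return {(r, c): (r, n_rows + c)
            for r in range(n_rows) for c in range(n_cols)}

def decode_hits(row_channels, col_channels, n_rows):
    """Decode fired channels back to candidate pixels. A single hit decodes
    uniquely; simultaneous multi-hits produce 'ghost' candidates, a known
    limitation of any coincidence-based multiplexing scheme."""
    return {(r, c - n_rows) for r in row_channels for c in col_channels}
```

The ghost-hit trade-off is the reason multiplexing designs are validated in simulation, as the abstract describes, before committing to readout hardware.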
Anja Furtwängler, A. B. Rohrlach, Thiseas C. Lamnidis
et al.
European populations underwent strong genetic changes during the Neolithic. Here, Furtwängler et al. provide ancient nuclear and mitochondrial genomic data from the region of Switzerland during the end of the Neolithic and the Early Bronze Age that reveal a complex genetic turnover during the arrival of steppe ancestry.
We consider the problem of simultaneous variable selection and estimation of the corresponding regression coefficients in an ultra-high dimensional linear regression model, an extremely important problem in the current era. Adaptive penalty functions are used in this regard to achieve the oracle variable selection property with a lighter computational burden. However, the usual adaptive procedures (e.g., adaptive LASSO) based on the squared-error loss function are extremely non-robust in the presence of data contamination, which is quite common in large-scale data (e.g., noisy gene expression data, spectral data). In this paper, we present a regularization procedure for ultra-high dimensional data using a robust loss function based on the popular density power divergence (DPD) measure together with the adaptive LASSO penalty. We theoretically study the robustness and large-sample properties of the proposed adaptive robust estimators for a general class of error distributions; in particular, we show that the proposed adaptive DPD-LASSO estimator is highly robust and satisfies the oracle variable selection property, and that the corresponding estimators of the regression coefficients are consistent and asymptotically normal under an easily verifiable set of assumptions. Numerical illustrations are provided for the commonly used normal error density. Finally, the proposal is applied to an interesting spectral dataset from the field of chemometrics, concerning the electron-probe X-ray microanalysis (EPXMA) of archaeological glass vessels from the 16th and 17th centuries.
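For intuition, the DPD loss for normal errors can be written in the following commonly cited form (up to additive constants not depending on the parameters; the paper's full penalized objective also includes the adaptive LASSO term, omitted here). The tuning parameter alpha > 0 trades efficiency against robustness:

```python
import math

def dpd_loss(residuals, sigma, alpha):
    """Density power divergence objective for normal errors (parameter-free
    additive constants dropped). Each residual contributes through
    exp(-alpha * r^2 / (2 sigma^2)), so gross outliers have a bounded,
    vanishing influence -- unlike the squared-error loss, which grows as r^2."""
    n = len(residuals)
    c = (2.0 * math.pi) ** (-alpha / 2.0) * sigma ** (-alpha)
    mean_term = sum(math.exp(-alpha * r * r / (2.0 * sigma * sigma))
                    for r in residuals) / n
    return c * (1.0 / math.sqrt(1.0 + alpha) - (1.0 + 1.0 / alpha) * mean_term)
```

As alpha grows, outliers are downweighted more aggressively at some cost in efficiency; as alpha tends to 0, the criterion approaches the (non-robust) likelihood-based loss.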