J. Ash, Rob Kitchin, Agnieszka Leszczynski
Results for "Geography"
Showing 20 of ~2,240,033 results · from CrossRef, arXiv, DOAJ, Semantic Scholar
Gordon H. Hanson
Treb Allen, Costas Arkolakis et al.
Simon Rudkin, Wanling Rudkin
Topological Data Analysis Ball Mapper (TDABM) offers a model-free visualization of multivariate data that does not necessitate the information loss associated with dimensionality reduction. TDABM (Dlotko, 2019) produces a cover of a multidimensional point cloud using equal-size balls; the ball radius is the only parameter. A TDABM visualization retains the full structure of the data. The graphs produced by TDABM can convey coloration according to further variables, model residuals, or variables within the multivariate data. An expanding literature makes use of the power of TDABM across Finance, Economics, Geography, Medicine and Chemistry, amongst others. We provide an introduction to TDABM and the ballmapper package for Stata.
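The ball cover at the heart of TDABM can be sketched in a few lines of Python. This is a toy greedy construction over a point cloud; the function name and interface are ours for illustration, not the ballmapper package's Stata API:

```python
import math

def ball_mapper_cover(points, radius):
    """Greedily cover `points` with balls of fixed `radius`.

    Returns a dict mapping each landmark's index to the indices of points
    inside its ball. A point may belong to several balls; shared points
    are what produce the edges of the TDABM graph.
    """
    def dist(a, b):
        return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))

    landmarks = {}      # landmark index -> covered point indices
    covered = set()
    for i, p in enumerate(points):
        if i in covered:
            continue                       # already inside an earlier ball
        members = [j for j, q in enumerate(points) if dist(p, q) <= radius]
        landmarks[i] = members
        covered.update(members)
    return landmarks

# Two balls of radius 1.5 suffice to cover these 1-D points
cover = ball_mapper_cover([(0.0,), (1.0,), (4.0,), (5.0,)], 1.5)
```

Balls that share points become connected nodes in the TDABM graph, which is where the visualization's structure comes from.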
Eleni Oikonomaki, Dimitris Belivanis, Christina Kakderi
The geography of innovation offers a framework to understand how territorial characteristics shape innovation, often via spatial and cognitive proximity. Empirical research has focused largely on national and regional scales, while urban and sub-regional geographies receive less attention. Local studies typically rely on limited indicators (e.g., firm-level data, patents, basic socioeconomic measures), with few offering a systematic framework integrating urban form, mobility, amenities, and human-capital proxies at the neighborhood scale. Our study investigates innovation at a finer spatial resolution, going beyond proprietary or static indicators. We develop the Local Innovation Determinants (LID) database and framework to identify key enabling factors across regions, combining traditional government data with publicly available data via APIs for a more granular understanding of the spatial dynamics shaping innovation capacity. Using exploratory big and geospatial data analytics and random forest models, we examine neighborhoods in New York and Massachusetts across four dimensions: social factors, economic characteristics, land use and mobility, and morphology and environment. Results show that alternative data sources offer significant yet underexplored potential to enhance insights into innovation dynamics. City policymakers should consider neighborhood-specific determinants and characteristics when designing and implementing local innovation strategies.
Pingping Wang, Yihong Yuan, Lingcheng Li et al.
PyGALAX is a Python package for geospatial analysis that integrates automated machine learning (AutoML) and explainable artificial intelligence (XAI) techniques to analyze spatial heterogeneity in both regression and classification tasks. It automatically selects and optimizes machine learning models for different geographic locations and contexts while maintaining interpretability through SHAP (SHapley Additive exPlanations) analysis. PyGALAX builds upon and improves the GALAX framework (Geospatial Analysis Leveraging AutoML and eXplainable AI), which has proven to outperform traditional geographically weighted regression (GWR) methods. Critical enhancements in PyGALAX over the original GALAX framework include automatic bandwidth selection and flexible kernel-function selection, providing greater flexibility and robustness for spatial modeling across diverse datasets and research questions. PyGALAX not only inherits all the functionality of the original GALAX framework but also packages it into an accessible, reproducible, and easily deployable Python toolkit while providing additional options for spatial modeling. It effectively addresses spatial non-stationarity and generates transparent insights into complex spatial relationships at both global and local scales, making advanced geospatial machine learning methods accessible to researchers and practitioners in geography, urban planning, environmental science, and related fields.
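The classical GWR building block that PyGALAX generalizes with AutoML (and whose bandwidth it selects automatically) can be illustrated with a locally weighted fit in plain Python. Names and data below are invented for illustration, not PyGALAX's API:

```python
import math

def local_fit(coords, x, y, at, bandwidth):
    """Locally weighted simple regression at location `at`.

    Observations are weighted by a Gaussian kernel of geographic distance;
    `bandwidth` controls how local the fit is. Returns (intercept, slope).
    """
    w = [math.exp(-0.5 * (math.dist(at, c) / bandwidth) ** 2) for c in coords]
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

# The x -> y relationship flips sign between two distant clusters,
# which a single global regression would average away
coords = [(0, 0), (0, 1), (1, 0), (10, 0), (10, 1), (11, 0)]
x = [1, 2, 3, 1, 2, 3]
y = [2, 4, 6, -1, -2, -3]
_, slope_west = local_fit(coords, x, y, (0, 0), bandwidth=1.0)
_, slope_east = local_fit(coords, x, y, (10, 0), bandwidth=1.0)
```

Spatial non-stationarity shows up as the two local slopes differing (about 2 in the west, about -1 in the east); the bandwidth choice governs how sharply the fits localize, which is why automating it matters.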
Sjoerd Halmans, Lavinia Paganini, Alexander Serebrenik et al.
Hackathons are time-bound collaborative events that often target software creation. Although hackathons have been studied in the past, existing work has focused on in-depth case studies, limiting our understanding of hackathons as a software engineering activity. To complement the existing body of knowledge, we introduce HackRep, a dataset of 100,356 hackathon GitHub repositories. We illustrate the ways HackRep can benefit software engineering researchers by presenting a preliminary investigation of hackathon project continuation, hackathon team composition, and an estimation of hackathon geography. We further illustrate the opportunities this dataset opens up, for instance the possibility of estimating hackathon durations from commit timestamps.
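The duration heuristic mentioned in the abstract is simple to state: take the span between a repository's first and last commit. A minimal sketch, with an illustrative function name and interface:

```python
from datetime import datetime, timezone

def estimate_duration_hours(commit_timestamps):
    """Estimate a hackathon's duration in hours as the span between the
    first and last commit in its repository (a rough proxy: setup commits
    before the event or cleanup after it would inflate the estimate).
    """
    times = sorted(commit_timestamps)
    return (times[-1] - times[0]).total_seconds() / 3600.0

commits = [
    datetime(2024, 3, 1, 9, 0, tzinfo=timezone.utc),
    datetime(2024, 3, 1, 15, 30, tzinfo=timezone.utc),
    datetime(2024, 3, 2, 9, 0, tzinfo=timezone.utc),
]
duration = estimate_duration_hours(commits)  # 24.0 hours
```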
Baohang Wang, Guangrong Li, Chaoying Zhao et al.
Potential tropospheric noise is a critical factor that undermines the effectiveness of deformation monitoring in Synthetic Aperture Radar Interferometry (InSAR) technologies. In most scenarios, many point targets within the InSAR deformation monitoring area either do not undergo deformation or exhibit only minimal deformation trends. The phases of densely distributed stable points can effectively respond to spatial tropospheric delays, particularly turbulent atmospheric phases. This study proposes a data-driven InSAR atmospheric correction method by exploring how to use these densely stable InSAR time series to model atmospheric phase delays. Our focus is on selecting stable InSAR time series point targets and evaluating the impact of different densities of stable points on atmospheric correction performance. Analysis of 645 interferograms derived from 217 Sentinel-1A SAR images, spanning from 13 June 2017 to 15 November 2024, demonstrates that the proposed method reduces the Root Mean Square Error (RMSE) by 70%, 59%, and 69% compared to the terrain-related linear approach, the General Atmospheric Correction Online Service, and common scene stacking methods, respectively. In addition, simulation data and leveling data were used to validate the proposed method. This article does not develop an independent InSAR atmospheric correction method. Instead, the proposed approach starts with the InSAR deformation time series, allowing for easy integration into existing InSAR workflows and widely used atmospheric correction strategies. It can serve as a post-processing tool to improve InSAR time series analysis.
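One simple stand-in for interpolating atmospheric delay from stable points is inverse-distance weighting. The sketch below is our illustration of the general idea (estimate the delay field from stable points, then subtract it from observed phases), not the paper's actual model:

```python
def idw_atmosphere(stable_points, query, power=2.0):
    """Inverse-distance-weighted estimate of atmospheric phase delay at
    `query`, interpolated from stable (non-deforming) InSAR points whose
    phases are assumed to be dominated by tropospheric delay.

    stable_points: list of ((x, y), phase) pairs; query: (x, y).
    """
    num = den = 0.0
    for (x, y), phase in stable_points:
        d2 = (x - query[0]) ** 2 + (y - query[1]) ** 2
        if d2 == 0.0:
            return phase            # query coincides with a stable point
        w = 1.0 / d2 ** (power / 2.0)
        num += w * phase
        den += w
    return num / den

stable = [((0.0, 0.0), 1.0), ((2.0, 0.0), 3.0)]
est = idw_atmosphere(stable, (1.0, 0.0))   # midway between equal weights
corrected = 5.0 - est                      # subtract delay from observed phase
```

The density of stable points controls how well turbulent atmospheric structure is captured, which is the trade-off the study evaluates.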
Mykyta Syromiatnikov, Victoria Ruvinskaya, Anastasiya Troynina
As the usage of large language models for problems beyond simple text understanding or generation increases, assessing their abilities and limitations becomes crucial. While significant progress has been made in this area over the last few years, most research has focused on benchmarking English, leaving other languages underexplored. This makes evaluating the reasoning and robustness of language models in Ukrainian particularly challenging. The purpose of this work is to establish a comprehensive benchmark for evaluating the reasoning capabilities of large language models in the Ukrainian language. This paper presents the ZNO-Eval benchmark based on real exam tasks from Ukraine's standardized educational testing system: the External Independent Evaluation and the National Multi-subject Test. With single-answer options, multiple-choice, matching, and open-ended questions from diverse subjects, including Ukrainian language, mathematics, history, and geography, this dataset paves the way toward a thorough analysis of reasoning capabilities across different domains and complexities. Evaluation of several well-known language models, such as GPT-3.5-Turbo, GPT-4o, GPT-4-Turbo, Mistral Large, Claude 3 Opus, and Gemini-1.5 Pro, on this benchmark demonstrated the superiority of GPT-4o in both common-knowledge reasoning and intricate language tasks. At the same time, Gemini-1.5 Pro and GPT-4-Turbo excelled in the arithmetic domain, leading in single-answer and open-ended math problems. While all models came close to maximal performance in text-only common-knowledge tasks like history and geography, there is still a gap for Ukrainian language and math, highlighting the importance of developing specialized language benchmarks for more accurate assessments of model capabilities and limitations across different languages and contexts.
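Scoring single-answer items in such a benchmark reduces to per-subject exact-match accuracy. The field names below are illustrative, not ZNO-Eval's actual schema:

```python
def score_benchmark(predictions, gold):
    """Per-subject exact-match accuracy for single-answer items.

    Each gold item is {"subject": str, "answer": str}; predictions align
    by index. Returns {subject: accuracy}.
    """
    by_subject = {}
    for pred, item in zip(predictions, gold):
        hits, total = by_subject.get(item["subject"], (0, 0))
        by_subject[item["subject"]] = (hits + (pred == item["answer"]), total + 1)
    return {s: hits / total for s, (hits, total) in by_subject.items()}

gold = [
    {"subject": "history", "answer": "A"},
    {"subject": "history", "answer": "C"},
    {"subject": "math", "answer": "B"},
]
scores = score_benchmark(["A", "B", "B"], gold)
```

Open-ended and matching items need more elaborate grading, which is part of what makes multi-format benchmarks like this one harder to build.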
Luca F. Di Cerbo
Any oriented $4$-dimensional Einstein metric with semi-definite sectional curvature satisfies the pointwise inequality \[ \frac{|s|}{\sqrt{6}}\geq|W^+|+|W^-|, \] where $s$, $W^+$ and $W^-$ are respectively the scalar curvature, the self-dual and anti-self-dual Weyl curvatures. We give a complete characterization of closed $4$-dimensional Einstein metrics with semi-definite sectional curvature saturating this pointwise inequality. We then present further consequences of this circle of ideas, in particular to the study of the geography of non-positively curved closed Einstein and Kaehler-Einstein $4$-manifolds. In the Kaehler-Einstein case, we obtain a sharp Gromov-Lueck type inequality.
Lynnette Hui Xian Ng, Kathleen M. Carley
Social Cyber Geography is the space in the digital cyber realm that is produced through social relations. Communication in the social media ecosystem happens not only because of human interactions but is also fueled by algorithmically controlled bot agents. Most studies have not looked at the social cyber geography of bots because they focus on bot activity within a single country. Since the technology used to create bots is universal, how prevalent are bots throughout the world? To quantify bot activity worldwide, we perform a multilingual and geospatial analysis on a large dataset of social data collected from X during the Coronavirus pandemic in 2021. This pandemic affected most of the world and thus is a common topic of discussion. Our dataset consists of ~100 million posts generated by ~31 million users. Most bot studies focus only on English-speaking countries because most bot detection algorithms are built for the English language; however, only 47% of the bots in our dataset write in English. To accommodate multiple languages, we built Multilingual BotBuster, a multi-language bot detection algorithm, to identify the bots in this diverse dataset. We also create a Geographical Location Identifier to swiftly identify the countries a user affiliates with in their profile description. Our results show that bots can appear to move from one country to another, but the language they write in remains relatively constant. Bots distribute narratives on distinct topics related to their self-declared country affiliation. Finally, despite the diverse distribution of bot locations around the world, the proportion of bots per country is about 20%. Our work stresses the importance of a united analysis of the cyber and physical realms, where we combine both spheres to inventory the language and location of social media bots and understand communication strategies.
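A toy version of the Geographical Location Identifier idea matches country names and demonyms in profile descriptions. The paper's system is more sophisticated; the gazetteer contents and function name here are illustrative:

```python
def locate_user(description, gazetteer):
    """Return the set of countries whose names or demonyms appear in a
    user's profile description (naive substring matching; real systems
    must handle ambiguity, transliteration, and multiple scripts).
    """
    text = description.lower()
    return {country for country, aliases in gazetteer.items()
            if any(alias in text for alias in aliases)}

gazetteer = {
    "France": ["france", "french", "paris"],
    "Japan": ["japan", "japanese", "tokyo"],
}
found = locate_user("Journalist based in Paris, covering Japan", gazetteer)
```

A user matching several countries is exactly the "bots appear to move between countries" signal the abstract describes, once affiliations are tracked over time.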
Jacopo Lenti, Lorenzo Costantini, Ariadna Fosch et al.
It is increasingly important to generate synthetic populations with explicit coordinates rather than coarse geographic areas, yet no established methods exist to achieve this. One reason is that latitude and longitude differ from other continuous variables, exhibiting large empty spaces and highly uneven densities. To address this, we propose a population synthesis algorithm that first maps spatial coordinates into a more regular latent space using Normalizing Flows (NF), and then combines them with other features in a Variational Autoencoder (VAE) to generate synthetic populations. This approach also learns the joint distribution of spatial and non-spatial features, exploiting spatial autocorrelations. We demonstrate the method by generating synthetic homes with the same statistical properties as real homes in 121 datasets, corresponding to diverse geographies. We further propose an evaluation framework that measures both spatial accuracy and practical utility, while ensuring privacy preservation. Our results show that the NF+VAE architecture outperforms popular benchmarks, including copula-based methods and uniform allocation within geographic areas. The ability to generate geolocated synthetic populations at fine spatial resolution opens the door to applications requiring detailed geography, from household responses to floods, to epidemic spread, evacuation planning, and transport modeling.
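The per-coordinate intuition behind regularizing latitude and longitude with a Normalizing Flow can be approximated by an empirical-CDF transform to a Gaussian latent. This is a toy stand-in for one marginal only, not the paper's flow, which learns the joint 2-D transform:

```python
from statistics import NormalDist

def to_gaussian_latent(values):
    """Map a 1-D marginal with uneven density onto an approximately
    standard-normal latent via its empirical CDF (assumes distinct
    values, for brevity).
    """
    n = len(values)
    ranks = {v: i for i, v in enumerate(sorted(values))}
    norm = NormalDist()
    # (rank + 0.5) / n keeps probabilities strictly inside (0, 1)
    return [norm.inv_cdf((ranks[v] + 0.5) / n) for v in values]

# Clustered longitudes with a huge empty gap become a regular latent,
# which is far easier for a VAE to model than the raw coordinates
latent = to_gaussian_latent([0.01, 0.02, 0.03, 179.0, 179.5])
```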
Nadia Boutaleb
Christian Mancas, Diana Christina Mancas
Presented are algorithms for enforcing function diagram commutativity and anti-commutativity database constraints, using the constraint-driven design and development methodology for database software applications, in the realm of the (Elementary) Mathematical Data Model ((E)MDM). MatBase, an intelligent data and knowledge management system prototype mainly based on the (E)MDM, uses these algorithms to automatically generate corresponding code in both of its versions (i.e., the MS Access one and the .NET and SQL Server one). Of course, any software developer may also use these algorithms manually. The paper also discusses the code generated to enforce two such constraints from a Geography database.
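A function-diagram commutativity constraint of the kind discussed above, h(x) = g(f(x)) for all x, can be checked on finite functions represented as dictionaries. This sketch (including the Geography-flavored example data) is ours, not MatBase's generated code:

```python
def diagram_commutes(f, g, h, domain):
    """Check the commutativity constraint h(x) = g(f(x)) on a finite
    domain, with functions stored as plain dicts.
    """
    return all(h[x] == g[f[x]] for x in domain)

# City -> Region -> Country versus the direct City -> Country mapping
f = {"Lyon": "Rhone", "Graz": "Styria"}
g = {"Rhone": "France", "Styria": "Austria"}
h = {"Lyon": "France", "Graz": "Austria"}
ok = diagram_commutes(f, g, h, ["Lyon", "Graz"])
bad = diagram_commutes(f, g, {"Lyon": "France", "Graz": "France"},
                       ["Lyon", "Graz"])
```

In a live database the same check would run as a trigger or validation routine on every update to any of the three mappings, which is the code the paper's algorithms generate.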
Zeyuan Hu, Akshay Subramaniam, Zhiming Kuang et al.
Modern climate projections often suffer from inadequate spatial and temporal resolution due to computational limitations, resulting in inaccurate representations of sub-grid processes. A promising technique to address this is the Multiscale Modeling Framework (MMF), which embeds a kilometer-resolution cloud-resolving model within each atmospheric column of a host climate model to replace traditional convection and cloud parameterizations. Machine learning (ML) offers a unique opportunity to make MMF more accessible by emulating the embedded cloud-resolving model and reducing its substantial computational cost. Although many studies have demonstrated proof-of-concept success of achieving stable hybrid simulations, it remains a challenge to achieve near operational-level success with real geography and comprehensive variable emulation that includes, for example, explicit cloud condensate coupling. In this study, we present a stable hybrid model capable of integrating for at least 5 years with near operational-level complexity, including coarse-grid geography, seasonality, explicit cloud condensate and wind predictions, and land coupling. Our model demonstrates skillful online performance, achieving a 5-year zonal mean tropospheric temperature bias within 2K, water vapor bias within 1 g/kg, and a precipitation RMSE of 0.96 mm/day. Key factors contributing to our online performance include an expressive U-Net architecture and physical thermodynamic constraints for microphysics. With microphysical constraints mitigating unrealistic cloud formation, our work is the first to demonstrate realistic multi-year cloud condensate climatology under the MMF framework. Despite these advances, online diagnostics reveal persistent biases in certain regions, highlighting the need for innovative strategies to further optimize online performance.
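One plausible form of a microphysical constraint, not necessarily the paper's actual scheme, is to forbid negative cloud condensate in the emulator's output while conserving total water:

```python
def clip_condensate(condensate, vapor):
    """Zero out negative condensate values in an ML emulator's output,
    moving the clipped mass into the vapor field so that total water
    (condensate + vapor) is conserved at every level.
    """
    fixed_c, fixed_v = [], []
    for c, v in zip(condensate, vapor):
        if c < 0.0:
            fixed_v.append(v + c)   # negative condensate becomes a vapor deficit
            fixed_c.append(0.0)
        else:
            fixed_v.append(v)
            fixed_c.append(c)
    return fixed_c, fixed_v

c, v = clip_condensate([0.2, -0.1, 0.0], [1.0, 1.0, 1.0])
```

Constraints like this are applied after the network's raw prediction and before the host model consumes it, which is how they can suppress unphysical cloud formation without retraining.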
Cheng-Hong Yang, Chih-Hsien Wu, Kuei-Hau Luo et al.
Air pollution has become a major global threat to human health. Urbanization and industrialization over the past few decades have increased air pollution. Plausible connections have been made between air pollutants and dementia. This study used machine learning algorithms (k-nearest neighbors, random forest, gradient-boosted decision trees, eXtreme gradient boosting, and CatBoost) to investigate the association between cognitive impairment and air pollution. Data from the Taiwan Biobank and 75 air-pollution-monitoring stations in Taiwan were analyzed to determine individual levels of exposure to air pollutants. The pollutants examined were particulate matter with a diameter of ≤ 2.5 μm (PM2.5), nitrogen dioxide, nitric oxide, carbon monoxide, and ozone. The results revealed that ozone, PM2.5, and carbon monoxide levels were the most strongly correlated with cognitive impairment, after adjustment for educational level, age, and household income. The model based on these factors achieved accuracy as high as 0.97 for detecting cognitive impairment, indicating a positive association between air pollution and cognitive impairment.
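Of the five algorithms compared, k-nearest neighbors is the simplest to sketch. The exposure features and labels below are invented for illustration, not Taiwan Biobank data:

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k=3):
    """Minimal k-nearest-neighbors classifier: label a query point by
    majority vote among its k closest training points.
    """
    dists = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

# Toy exposure features: (ozone, PM2.5, CO); 1 = cognitive impairment flag
X = [(30, 10, 0.3), (32, 12, 0.4), (60, 35, 0.9), (62, 38, 1.0), (58, 33, 0.8)]
y = [0, 0, 1, 1, 1]
pred = knn_predict(X, y, (59, 34, 0.85), k=3)
```

In practice the features would be standardized first, since raw pollutant scales differ by orders of magnitude and would otherwise dominate the distance.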
Marcos Plischuk, Rocío García Mancuso, Bárbara Desántolo
Mohammed Nazhan Mahdi
Sergey Shevelev, Pavel Mikhaylov, Irina Vorobeva et al.
The work is based on materials from 15 sample plots laid out in mixed spruce stands of the Yenisei Ridge, with relative completeness in the range of 1.27-1.62, belonging to the green moss group of forest types. Based on the data obtained, the features of changes in spruce heights and their interdependence with other taxation characteristics were established. The results of data processing indicate the need to develop a set of taxation standards for the Yenisei Ridge region.
Silvia De Nicolò, Enrico Fabrizi, Aldo Gardini
Poverty mapping is a powerful tool to study the geography of poverty. The choice of spatial resolution is central, as poverty measures defined at a coarser level may mask heterogeneity at finer levels. We introduce a small-area multi-scale approach integrating survey and remote sensing data that leverages information at different spatial resolutions and accounts for hierarchical dependencies, preserving the coherence of estimates. We map poverty rates by proposing a Bayesian Beta-based model equipped with a new benchmarking algorithm that accounts for the double-bounded support. A simulation study shows the effectiveness of our proposal, and an application to Bangladesh is discussed.
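A minimal benchmarking step for double-bounded rates can be sketched as a common shift on the logit scale, chosen so the weighted mean matches the benchmark. This is a simple stand-in for the paper's algorithm; the logit shift guarantees the adjusted rates stay in (0, 1), which plain ratio benchmarking does not:

```python
import math

def benchmark_rates(rates, weights, target):
    """Adjust area-level poverty rates so their weighted mean equals
    `target`, applying a common shift on the logit scale (found by
    bisection) so adjusted rates remain inside (0, 1).
    """
    def logit(p): return math.log(p / (1 - p))
    def expit(z): return 1 / (1 + math.exp(-z))

    lo, hi = -10.0, 10.0
    for _ in range(100):                  # bisection on the common shift
        delta = (lo + hi) / 2
        mean = sum(w * expit(logit(r) + delta)
                   for r, w in zip(rates, weights)) / sum(weights)
        if mean < target:
            lo = delta
        else:
            hi = delta
    return [expit(logit(r) + delta) for r in rates]

adj = benchmark_rates([0.10, 0.30, 0.50], [1, 1, 1], target=0.25)
```

The shift preserves the ordering of areas while pulling the aggregate onto the benchmark, the coherence property the abstract emphasizes.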
Page 17 of 112,002