JAMMEval: A Refined Collection of Japanese Benchmarks for Reliable VLM Evaluation
Issa Sugiura, Koki Maeda, Shuhei Kurita
et al.
Reliable evaluation is essential for the development of vision-language models (VLMs). However, Japanese VQA benchmarks have undergone far less iterative refinement than their English counterparts. As a result, many existing benchmarks contain issues such as ambiguous questions, incorrect answers, and instances that can be solved without visual grounding, undermining evaluation reliability and leading to misleading conclusions in model comparisons. To address these limitations, we introduce JAMMEval, a refined collection of Japanese benchmarks for reliable VLM evaluation. It is constructed by systematically refining seven existing Japanese benchmark datasets through two rounds of human annotation, improving both data quality and evaluation reliability. In our experiments, we evaluate open-weight and proprietary VLMs on JAMMEval and analyze the capabilities of recent models on Japanese VQA. We further demonstrate the effectiveness of our refinement by showing that the resulting benchmarks yield evaluation scores that better reflect model capability, exhibit lower run-to-run variance, and improve the ability to distinguish between models of different capability levels. We release our dataset and code to advance reliable evaluation of VLMs.
Magnetic Interference Correction Method for Dynamic Measurement of Steering Head and Its Application
Zhu Jinsong, Song Xiaojian, Yu Yanfei
et al.
During operation of a rotary steering system, the magnetic interference affecting the steering head during dynamic measurement comes mainly from two sources: magnetic fields caused by current fluctuations on the internal M30 bus, and external magnetic interference from the real-time drilling environment. Specifically, when the M30 bus current exceeds 1 A, it generates an interfering magnetic field that directly degrades the magnetic measurement accuracy of the steering head. Meanwhile, factors such as the bit and mud flow further aggravate the interference, which not only significantly reduces the magnetic measurement accuracy of the steering head but may also mislead the determination of azimuthal gamma and resistivity imaging areas, leading to engineering errors. To address these problems, a magnetic interference correction scheme for the steering head is proposed. First, a regression analysis of magnetic interference against the M30 bus current was conducted. Then, by integrating attitude measurement principles with multi-point analysis techniques, single-point correction and a multi-point correction model optimized by an adaptive genetic algorithm were applied to calibrate the three-axis fluxgate sensor. Analysis of data obtained after application in the DengX well shows that the method reduces the azimuth angle error of dynamic magnetic measurement in the rotary steering system by 85%. Further application in multiple well clusters in the Huabei and Changqing oilfields shows that the azimuth angle consistency of static magnetic measurement improved by 13%. These applications demonstrate that the magnetic interference correction method significantly improves the dynamic magnetic measurement accuracy of rotary steering.
Chemical engineering, Petroleum refining. Petroleum products
Prediction method of pore pressure of carbonate rock and shale considering regional differences - a case study of the Luzhou block in southern Sichuan
Jianhua Guo, Yong Ma, Yatian Li
et al.
Luzhou, in the southern Sichuan Basin, is one of China’s major shale-gas production areas; however, its extensive Permian carbonate formations frequently exhibit gas-logging anomalies and minor gas intrusion during drilling, in both carbonate and shale strata, severely impairing drilling efficiency. In this study, a two-pronged modeling strategy is introduced. First, based on porous–elastic coupling theory, a pore-pressure prediction model for carbonate and shale formations is established, subdividing the reservoir into high- and low-pressure zones to enhance precision. Second, the conventional Eaton model is applied across the full well section. Model performance is assessed against measured pressure data from the B and C wells. In the carbonate interval, the porous–elastic model achieved an average error of 6.02%, compared with 39.06% for the Eaton model. In the shale interval, the two approaches produced comparable errors of 8.75% for the fluid–solid coupling method and 3.30% for the Eaton method. The novel incorporation of stress-dependent zonal partitioning into the porous–elastic framework represents a significant advance beyond single-model approaches reported in previous literature. These findings demonstrate that the coupled model markedly improves pressure estimates in carbonate strata, while the Eaton model remains appropriate for shale. This hybrid methodology offers robust, quantitative support for guiding drilling design and risk mitigation in complex geological settings.
Petroleum refining. Petroleum products, Petrology
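As a point of reference for the Eaton baseline mentioned in the Luzhou abstract above, the sonic-log form of Eaton's method is commonly written as P_p = S - (S - P_n) * (dt_n / dt)^3. The sketch below assumes that form and the usual exponent of 3; the abstract does not say which log type or exponent the authors used, and the function name and input values are hypothetical.

```python
def eaton_pore_pressure(overburden, hydrostatic, dt_normal, dt_observed, exponent=3.0):
    """Eaton's sonic-log estimate of pore pressure.

    overburden, hydrostatic: stresses/pressures in the same unit (e.g. MPa).
    dt_normal: sonic transit time on the normal compaction trend.
    dt_observed: measured sonic transit time at the same depth.
    exponent: 3.0 is the value commonly quoted for sonic data (assumption here).
    """
    if dt_normal <= 0 or dt_observed <= 0:
        raise ValueError("transit times must be positive")
    ratio = dt_normal / dt_observed
    return overburden - (overburden - hydrostatic) * ratio ** exponent


# Hypothetical inputs: a slower-than-normal transit time signals overpressure.
normal_case = eaton_pore_pressure(60.0, 25.0, 70.0, 70.0)
overpressured_case = eaton_pore_pressure(60.0, 25.0, 70.0, 80.0)
```

When the observed transit time lies on the normal trend the estimate reduces to hydrostatic pressure; any slowdown relative to the trend pushes the estimate toward the overburden stress.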
Repairing Language Model Pipelines by Meta Self-Refining Competing Constraints at Runtime
Mojtaba Eshghie
Language Model (LM) pipelines can dynamically refine their outputs against programmatic constraints. However, their effectiveness collapses when faced with competing soft constraints, leading to inefficient backtracking loops where satisfying one constraint violates another. We introduce Meta Self-Refining, a framework that equips LM pipelines with a meta-corrective layer to repair these competitions at runtime/inference-time. Our approach monitors the pipeline's execution history to detect oscillatory failures. Upon detection, it invokes a meta-repairer LM that analyzes the holistic state of the backtracking attempts and synthesizes a strategic instruction to balance the competing requirements. This self-repair instruction guides the original LM out of a failing refining loop towards a successful output. Our results show Meta Self-Refining can successfully repair these loops, leading to more efficient LM programs.
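The oscillatory failure described in the Meta Self-Refining abstract above (satisfying one constraint violates another, and back again) can be illustrated with a small detector over an execution history. This is only a sketch under assumptions: the history representation (one set of violated constraint names per attempt), the function name, and the `min_cycles` threshold are hypothetical and not taken from the paper.

```python
from typing import FrozenSet, Sequence


def is_oscillating(history: Sequence[FrozenSet[str]], min_cycles: int = 2) -> bool:
    """Return True if the most recent attempts alternate between two
    different, non-empty sets of violated constraints (A, B, A, B, ...)."""
    needed = 2 * min_cycles
    if len(history) < needed:
        return False
    recent = history[-needed:]
    first, second = recent[0], recent[1]
    if first == second or not first or not second:
        return False
    return all(
        state == (first if i % 2 == 0 else second)
        for i, state in enumerate(recent)
    )


# Hypothetical history: fixing "too_long" breaks "must_include_details" and vice versa.
flip_flop = [frozenset({"too_long"}), frozenset({"must_include_details"})] * 2
stuck = [frozenset({"too_long"})] * 4
```

A pipeline could call such a check after each backtracking attempt and hand control to a repair step once it returns True; a history that repeats the same violation (`stuck`) is a different failure mode and is deliberately not flagged here.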
Table of Contents, Volume 14, Issue 4, 2024
Petroleum refining. Petroleum products, Gas industry
Experimental and simulation study on fracture conductivity of acid-fracturing in Dengying Formation of Sichuan Basin
CHEN Xiang, WANG Guan, LIU Pingli
et al.
Acid fracturing is a critical stimulation technology for enhancing production in ultra-deep marine carbonate reservoirs. A significant challenge in this process is maintaining the conductivity of acid-etched fractures under ultra-high temperature and high closure stress conditions. To address this, conductivity experiments were conducted using various acid solutions and their combinations. The morphology of the acid-etched fractures was captured using a three-dimensional laser scanner. The degree of fracture closure was analyzed using the Airy stress function and the complex variable method, and this analysis was integrated with the local cubic law and an acid fracturing model to build a numerical method for evaluating the conductivity of acid-etched fractures. The results show that under high closure stress (90 MPa), the conductivity obtained with the acids and their combinations decreases by an order of magnitude compared with low closure stress (5 MPa). As closure stress increases, different acids and combinations exhibit distinct patterns of conductivity reduction, with potential for two rapid decline phases. Furthermore, specific acid combinations have been identified that enhance the conductivity of fractures under extreme conditions of temperature and pressure. The average error between the conductivity values calculated by the model and those obtained from experiments is relatively low, about 10.6%, indicating that the model can effectively characterize the distribution and magnitude of conductivity across different points within the fracture. In the Sichuan Basin, under identical engineering parameters, the conductivity of acid-etched fractures in the 4th member of the Dengying Formation is higher than that in the 2nd member. This research provides valuable theoretical guidance for optimizing the design of acid fracturing stimulation schemes in ultra-deep marine carbonate rocks in the Sichuan Basin.
Petroleum refining. Petroleum products, Gas industry
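The local cubic law named in the Dengying abstract above treats each small patch of a fracture as a pair of parallel plates, for which permeability is w^2/12 and conductivity (permeability times width) is w^3/12. The sketch below shows only that textbook relation applied cell by cell to an aperture map; it is not the authors' coupled closure model, and the function names and aperture values are hypothetical.

```python
def cubic_law_conductivity(aperture_m: float) -> float:
    """Parallel-plate conductivity k_f * w = w**3 / 12, in m^3, for aperture w in metres."""
    if aperture_m < 0:
        raise ValueError("aperture must be non-negative")
    return aperture_m ** 3 / 12.0


def local_conductivity_map(aperture_map):
    """Apply the cubic law independently to every cell of a 2D aperture grid."""
    return [[cubic_law_conductivity(w) for w in row] for row in aperture_map]


# Hypothetical 2 x 2 aperture grid in metres; 0.0 marks a closed contact point.
apertures = [[1.0e-3, 0.5e-3], [0.0, 2.0e-3]]
conductivities = local_conductivity_map(apertures)
```

Because conductivity scales with the cube of aperture, halving the residual width of a cell cuts its conductivity by a factor of eight, which is one reason closure stress has such a strong effect.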
Diffusion-based Pose Refinement and Multi-hypothesis Generation for 3D Human Pose Estimation
Hongbo Kang, Yong Wang, Mengyuan Liu
et al.
Previous probabilistic models for 3D Human Pose Estimation (3DHPE) aimed to enhance pose accuracy by generating multiple hypotheses. However, most of the hypotheses generated deviate substantially from the true pose. Compared to deterministic models, the excessive uncertainty in probabilistic models leads to weaker performance in single-hypothesis prediction. To address these two challenges, we propose a diffusion-based refinement framework called DRPose, which refines the output of deterministic models by reverse diffusion and achieves more suitable multi-hypothesis prediction for the current pose benchmark by multi-step refinement with multiple noises. To this end, we propose a Scalable Graph Convolution Transformer (SGCT) and a Pose Refinement Module (PRM) for denoising and refining. Extensive experiments on Human3.6M and MPI-INF-3DHP datasets demonstrate that our method achieves state-of-the-art performance on both single and multi-hypothesis 3DHPE. Code is available at https://github.com/KHB1698/DRPose.
RHanDS: Refining Malformed Hands for Generated Images with Decoupled Structure and Style Guidance
Chengrui Wang, Pengfei Liu, Min Zhou
et al.
Although diffusion models can generate high-quality human images, their applications are limited by the instability in generating hands with correct structures. In this paper, we introduce RHanDS, a conditional diffusion-based framework designed to refine malformed hands by utilizing decoupled structure and style guidance. The hand mesh reconstructed from the malformed hand offers structure guidance for correcting the structure of the hand, while the malformed hand itself provides style guidance for preserving the style of the hand. To alleviate the mutual interference between style and structure guidance, we introduce a two-stage training strategy and build a series of multi-style hand datasets. In the first stage, we use paired hand images for training to ensure stylistic consistency in hand refining. In the second stage, various hand images generated based on human meshes are used for training, enabling the model to gain control over the hand structure. Experimental results demonstrate that RHanDS can effectively refine hand structure while preserving consistency in hand style.
OpenCodeInterpreter: Integrating Code Generation with Execution and Refinement
Tianyu Zheng, Ge Zhang, Tianhao Shen
et al.
The introduction of large language models has significantly advanced code generation. However, open-source models often lack the execution capabilities and iterative refinement of advanced systems like the GPT-4 Code Interpreter. To address this, we introduce OpenCodeInterpreter, a family of open-source code systems designed for generating, executing, and iteratively refining code. Supported by Code-Feedback, a dataset featuring 68K multi-turn interactions, OpenCodeInterpreter integrates execution and human feedback for dynamic code refinement. Our comprehensive evaluation of OpenCodeInterpreter across key benchmarks such as HumanEval, MBPP, and their enhanced versions from EvalPlus reveals its exceptional performance. Notably, OpenCodeInterpreter-33B achieves an accuracy of 83.2 (76.4) on the average (and plus versions) of HumanEval and MBPP, closely rivaling GPT-4's 84.2 (76.2), and rises further to 91.6 (84.6) with synthesized human feedback from GPT-4. OpenCodeInterpreter narrows the gap between open-source code generation models and proprietary systems like GPT-4 Code Interpreter.
Structural Optimization of Suction Pile Template for Deep-water Shallow Soft Formations
Ma Baojin, Zhang Aixia
Abundant oil and gas resources are found in the deep waters of the South China Sea, but drilling operations in this region face severe challenges such as adverse environmental conditions and high geological risks. This paper presents a new type of suction pile template suitable for the South China Sea to satisfy the requirements of drilling operations in shallow soft formations. The template is characterized by a simple structure, low construction cost, high installation efficiency and easy reusability. The capacity of this large-diameter barrel structure was analyzed using the API method and finite element analysis (FEA). The weakening effect of cyclic loading on the ultimate bearing capacity of the suction pile template was evaluated using a soil constitutive model implemented through secondary development. An engineering example illustrates the features of the suction pile template. The study demonstrates that the new suction pile template can greatly improve the bearing capacity of the whole wellhead system and provide a safe, efficient, economical and reliable solution for deep-water drilling in shallow soft formations. The conclusions and the proposed method provide technical guidance for engineering purposes.
Chemical engineering, Petroleum refining. Petroleum products
Refined and refined harmonic Jacobi--Davidson methods for computing several GSVD components of a large regular matrix pair
Jinzhi Huang, Zhongxiao Jia
Three refined and refined harmonic extraction-based Jacobi--Davidson (JD) type methods are proposed, and their thick-restart algorithms with deflation and purgation are developed to compute several generalized singular value decomposition (GSVD) components of a large regular matrix pair. The new methods are called refined cross product-free (RCPF), refined cross product-free harmonic (RCPF-harmonic) and refined inverse-free harmonic (RIF-harmonic) JDGSVD algorithms, abbreviated as RCPF-JDGSVD, RCPF-HJDGSVD and RIF-HJDGSVD, respectively. The new JDGSVD methods are more efficient than the corresponding standard and harmonic extraction-based JDSVD methods proposed previously by the authors, and can overcome the erratic behavior and intrinsic possible non-convergence of the latter ones. Numerical experiments illustrate that RCPF-JDGSVD performs better for the computation of extreme GSVD components while RCPF-HJDGSVD and RIF-HJDGSVD suit better for that of interior GSVD components.
Optimal trees of tangles: refining the essential parts
Sandra Albrechtsen
We combine the two fundamental fixed-order tangle theorems of Robertson and Seymour into a single theorem that implies both, in a best possible way. We show that, for every $k \in \mathbb{N}$, every tree-decomposition of a graph $G$ which efficiently distinguishes all its $k$-tangles can be refined to a tree-decomposition whose parts are either too small to be home to a $k$-tangle, or as small as possible while being home to a $k$-tangle.
An intelligent identification method of safety risk while drilling in gas drilling
Wanjun HU, Wenhe XIA, Yongjie LI
et al.
In view of the shortcomings of current intelligent drilling technology in drilling condition representation, sample collection, data processing and feature extraction, an intelligent method for identifying safety risks while drilling was established. Correlation analysis was used to determine the parameters that indicate gas drilling safety risk. By collecting monitoring data from the safety risk periods of more than 20 wells, a sample database covering a variety of safety risks in gas drilling was established, and the number of samples was expanded using few-shot learning. According to the form of the gas drilling monitoring data samples, a two-layer convolutional neural network architecture was designed, and multiple convolution kernels of different sizes and weights were set to perform vertical and horizontal convolutions over the samples, extracting and learning the variation patterns and correlation characteristics of multiple monitoring parameters. Finally, based on the training results of the neural network, samples of different kinds of safety risks were selected to enhance recognition accuracy. Compared with the traditional BP (error back propagation) fully connected neural network architecture, this method identifies safety risk characteristics in gas drilling more deeply and effectively, and can thus identify and predict risks in advance, which helps to avoid and quickly resolve safety risks while drilling. Field application has shown that the method is practical and identifies the various safety risks encountered while gas drilling with an accuracy of about 90%.
Petroleum refining. Petroleum products
Adjustment for Unmeasured Spatial Confounding in Settings of Continuous Exposure Conditional on the Binary Exposure Status: Conditional Generalized Propensity Score-Based Spatial Matching
Honghyok Kim, Michelle Bell
Propensity score (PS) matching to estimate causal effects of exposure is biased when unmeasured spatial confounding exists. Some exposures are continuous yet dependent on a binary variable (e.g., level of a contaminant (continuous) within a specified radius from residence (binary)). Further, unmeasured spatial confounding may vary by spatial patterns for both continuous and binary attributes of exposure. We propose a new generalized propensity score (GPS) matching method for such settings, referred to as conditional GPS (CGPS)-based spatial matching (CGPSsm). A motivating example is to investigate the association between proximity to refineries with high petroleum production and refining (PPR) and stroke prevalence in the southeastern United States. CGPSsm matches exposed observational units (e.g., exposed participants) to unexposed units by their spatial proximity and GPS integrated with spatial information. GPS is estimated by separately estimating PS for the binary status (exposed vs. unexposed) and CGPS on the binary status. CGPSsm maintains the salient benefits of PS matching and spatial analysis: straightforward assessments of covariate balance and adjustment for unmeasured spatial confounding. Simulations showed that CGPSsm can adjust for unmeasured spatial confounding. Using our example, we found positive association between PPR and stroke prevalence. Our R package, CGPSspatialmatch, has been made publicly available.
VLSI-Inspired Methods for Student Learning Community Creation and Refinement
Sheng Lun Cao
COVID-19 significantly disrupted how educational content is delivered in academic institutions, rapidly accelerating the adoption of online and blended learning. This thesis explores the creation and refinement of optimized student learning communities as a means to support online and blended learning in the pandemic and post-pandemic setting. Students enrolled in university courses can be modeled as an enrollment network akin to a circuit netlist. Learning communities are created by clustering students into groups, optimized for maximum internal connection to support student learning, and minimum external connection to reduce disease transmission. Three VLSI-based clustering algorithms, Hyperedge Coarsening, Modified Hyperedge Coarsening, and Best Choice, are modified to cluster student enrollment networks. The learning communities created by the clustering algorithms are further refined by the Simulated Annealing algorithm using the same optimization criteria. The Learning Community Creation and Refinement Framework combines all three stages of network modeling, learning community creation, and learning community refinement. The proposed framework is tested on both the 3rd year Electrical Engineering Fall 2020 enrollment dataset and a very large Fall 2020 and Winter 2021 enrollment dataset. Best Choice performed the best among the clustering algorithms, capable of creating learning communities that meet the optimization criteria for a given maximum cluster size. Simulated Annealing can refine the clustering results by significantly increasing cluster quality. The framework is capable of creating and refining learning communities for both the small and the large enrollment networks, but it is better suited for creating tailored learning communities at a program level. Future work, including creating student learning communities based on other optimization criteria, should be explored.
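The refinement stage described in the thesis abstract above can be illustrated with a toy simulated annealing loop that swaps students between clusters to raise the total weight of within-cluster connections. This is a minimal sketch under assumptions, not the thesis's implementation: the pairwise edge representation (the thesis models a netlist-like enrollment network), the swap move, the cooling schedule, and all names and values are hypothetical.

```python
import math
import random


def internal_weight(assign, edges):
    """Sum of edge weights whose two endpoints share a cluster."""
    return sum(w for (u, v), w in edges.items() if assign[u] == assign[v])


def anneal(assign, edges, steps=2000, t0=1.0, cooling=0.995, seed=0):
    """Refine a clustering by swapping members of different clusters.

    Swaps keep every cluster's size fixed, so a maximum cluster size that
    holds for the input also holds for the output.
    """
    rng = random.Random(seed)
    assign = dict(assign)
    nodes = list(assign)
    current = internal_weight(assign, edges)
    best, best_assign = current, dict(assign)
    temp = t0
    for _ in range(steps):
        u, v = rng.sample(nodes, 2)
        if assign[u] == assign[v]:
            continue  # swapping within one cluster changes nothing
        assign[u], assign[v] = assign[v], assign[u]
        candidate = internal_weight(assign, edges)
        delta = candidate - current
        # Always accept improvements; accept losses with Boltzmann probability.
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current = candidate
            if current > best:
                best, best_assign = current, dict(assign)
        else:
            assign[u], assign[v] = assign[v], assign[u]  # undo the swap
        temp *= cooling
    return best_assign, best


# Hypothetical four-student network: weights count shared courses.
edges = {("a", "b"): 3, ("c", "d"): 3, ("a", "c"): 1}
initial = {"a": 0, "c": 0, "b": 1, "d": 1}
refined, score = anneal(initial, edges)
```

Minimizing external connections is the same objective seen from the other side here, since the total edge weight is fixed and every edge is either internal or external.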
Further Refining Swampland dS Conjecture in Mimetic f(G) Gravity
S. Noori Gashti, J. Sadeghi, M. R. Alipour
Mimetic gravity has been studied in the literature in various extensions of general relativity, such as mimetic f(R) gravity, mimetic f(R, T) gravity, mimetic f(R, G) gravity, etc. This paper presents a set of equations arising from mimetic conditions and studies cosmic inflation with a combination of mimetic f(G) gravity and swampland dS conjectures. We first thoroughly introduce mimetic f(G) gravity and calculate some cosmological parameters such as the scalar spectral index, the tensor-to-scalar ratio, and the slow-roll parameters. We also investigate the potential according to mimetic f(G) gravity. Then we challenge the swampland dS conjectures with this condition. By expressing the coefficients of the swampland dS conjectures, viz. $C_{1}$ and $C_2$, in terms of $n_{s}$ and $r$, we plot some figures and determine the allowable range for each of these cosmological parameters and coefficients, and finally compare these results with observational data such as Planck and BICEP2/Keck array data. We show that $C_{1}$ and $C_2$ are not $\mathcal{O}(1)$, so the refining swampland dS conjecture is not satisfied for this inflationary model. Then we examine the model with the further refining swampland dS conjecture, which has a set of free parameters satisfying $a,b>0$, $q>2$, and $a+b=1$. By adjusting these parameters, the compatibility of this conjecture with the inflationary model can be discussed. We determine that the further refining swampland dS conjecture is satisfied: when $a < \frac{1}{1.00489}\approx 0.99513$, we can always find $a$, $b$ and $q$ with $q>2$. For example, for $q=2.4$ we find $0.99185\leq a < 1$, so we can choose $a=0.99235$, consistent with the condition $a < 0.99513$. Since $b=1-a$, we have $b=1-0.99235=0.00765 > 0$.
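The closing arithmetic in the abstract above can be checked directly. The snippet below only re-evaluates the numbers quoted there (the bound 1/1.00489, the interval for q = 2.4, and the choice a = 0.99235); the helper name is hypothetical and the snippet makes no claim about how those bounds were derived.

```python
def satisfies_quoted_conditions(a: float, q: float, a_lower: float, a_upper: float) -> bool:
    """Check a, b > 0, a + b = 1 (via b = 1 - a), q > 2, and a_lower <= a < a_upper."""
    b = 1.0 - a
    return a > 0 and b > 0 and q > 2 and a_lower <= a < a_upper


upper_bound = 1.0 / 1.00489   # quoted as 0.99513 in the abstract
a_choice = 0.99235            # value chosen in the abstract for q = 2.4
b_choice = 1.0 - a_choice     # quoted as 0.00765
ok = satisfies_quoted_conditions(a_choice, 2.4, 0.99185, upper_bound)
```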
Cost-benefit analysis of gasoline demand control policies and its greenhouse gas mitigation co-benefits
M. Moradi, M. Salimi, Majid Amidpour
Environmental Science
Impact of filler on the soft phase molecular dynamics of PP/EPDM/SiO2 thermoplastic elastomeric nano-composites
Sayed Z. Mohammady, Khalid S. Khairou
We investigated the impact of silica nanoparticles on the complex shear moduli and molecular dynamics of elastomeric nano-composites formed from ethylene propylene diene monomer (EPDM) and polypropylene (PP). The blended system was composed of immiscible amorphous/semi-crystalline polymers. The complex shear moduli of the EPDM/PP/SiO2 nano-composites and their corresponding neat filler blends were measured at different shear rates in the temperature window from −95 to 50 °C. The impacts of filler on the stress–strain and thermal behaviors were also investigated. The nano-composites exhibited higher shear moduli than the corresponding neat polymer blends. This trend was linked to the hindered glass-relaxation dynamics of the EPDM polymer in the nano-composites.
Petroleum refining. Petroleum products
The Occurrence and Distribution of Oleanane Biomarkers in Crude Oils as an Index
Onojake Chukunedum, Selegha Abrakasa
Oleanane biomarkers are age-diagnostic indicators of the source of organic matter input in fossil fuels, and can unravel the stage of development of a petroleum system in the petroleum-generating rock. Representative samples of crude oil obtained from two separate fields in the southern Nigeria province were evaluated geochemically using Gas Chromatography–Mass Spectrometry. The data obtained from the analysis of the crude oil samples revealed the presence of the 18α(H)-oleanane biomarker. The occurrence of the 18α(H)-oleanane biomarker in the Niger Delta oils provides diagnostic evidence on age and organic matter source in all samples. The oleanane indices for the samples ranged from 0.32 to 1.03 (> 0.30), which suggests that the oils are from Tertiary age source rocks with resilient terrestrial organic matter input. Only crude oil sample KD03 had an oleanane index of 0.03 (< 0.30), which indicates crude oil derived from a Late Cretaceous or younger source with some marine input. The presence of 18α(H)-oleanane in Niger Delta crude oils confirms an earlier postulation that most crude oils from the region had a greater terrestrial organic matter input.
Petroleum refining. Petroleum products
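The interpretation rule stated in the oleanane abstract above (indices above 0.30 read as Tertiary source rocks with terrestrial input, indices below 0.30 as Late Cretaceous or younger with some marine input) can be written as a small lookup. This is a sketch only: the function name and labels are hypothetical, the handling of an index of exactly 0.30 is an assumption because the abstract does not cover it, and only sample KD03 is named in the source.

```python
THRESHOLD = 0.30  # cutoff quoted in the abstract


def interpret_oleanane_index(index: float) -> str:
    """Map an oleanane index to the interpretation given in the abstract."""
    if index < 0:
        raise ValueError("oleanane index cannot be negative")
    if index > THRESHOLD:
        return "Tertiary source rock, terrestrial organic matter input"
    if index < THRESHOLD:
        return "Late Cretaceous or younger source, some marine input"
    return "indeterminate"  # assumption: the abstract does not address index == 0.30


# Values quoted in the abstract: the reported range endpoints and sample KD03.
low_end = interpret_oleanane_index(0.32)
high_end = interpret_oleanane_index(1.03)
kd03 = interpret_oleanane_index(0.03)
```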
Refining Language Models with Compositional Explanations
Huihan Yao, Ying Chen, Qinyuan Ye
et al.
Pre-trained language models have been successful on text classification tasks, but are prone to learning spurious correlations from biased datasets, and are thus vulnerable when making inferences in a new domain. Prior work reveals such spurious patterns via post-hoc explanation algorithms which compute the importance of input features. Further, the model is regularized to align the importance scores with human knowledge, so that the unintended model behaviors are eliminated. However, such a regularization technique lacks flexibility and coverage, since only importance scores towards a pre-defined list of features are adjusted, while more complex human knowledge such as feature interaction and pattern generalization can hardly be incorporated. In this work, we propose to refine a learned language model for a target domain by collecting human-provided compositional explanations regarding observed biases. By parsing these explanations into executable logic rules, the human-specified refinement advice from a small set of explanations can be generalized to more training examples. We additionally introduce a regularization term allowing adjustments for both importance and interaction of features to better rectify model behavior. We demonstrate the effectiveness of the proposed approach on two text classification tasks by showing improved performance in target domain as well as improved model fairness after refinement.