Hasil untuk "General Works"

Menampilkan 20 dari ~9790065 hasil · dari CrossRef, DOAJ, arXiv, Semantic Scholar

JSON API
S2 Open Access 2016
A General-Purpose Machine Learning Framework for Predicting Properties of Inorganic Materials

Logan T. Ward, Ankit Agrawal, A. Choudhary et al.

A very active area of materials research is to devise methods that use machine learning to automatically extract predictive models from existing materials data. While prior examples have demonstrated successful models for some applications, many more applications exist where machine learning can make a strong impact. To enable faster development of machine-learning-based models for such applications, we have created a framework capable of being applied to a broad range of materials data. Our method works by using a chemically diverse list of attributes, which we demonstrate are suitable for describing a wide variety of properties, and a novel method for partitioning the data set into groups of similar materials in order to boost the predictive accuracy. In this manuscript, we demonstrate how this new method can be used to predict diverse properties of crystalline and amorphous materials, such as band gap energy and glass-forming ability.

1480 sitasi en Materials Science, Physics
S2 Open Access 2015
Ranking treatments in frequentist network meta-analysis works without resampling methods

G. Rücker, G. Schwarzer

BackgroundNetwork meta-analysis is used to compare three or more treatments for the same condition. Within a Bayesian framework, for each treatment the probability of being best, or, more general, the probability that it has a certain rank can be derived from the posterior distributions of all treatments. The treatments can then be ranked by the surface under the cumulative ranking curve (SUCRA). For comparing treatments in a network meta-analysis, we propose a frequentist analogue to SUCRA which we call P-score that works without resampling.MethodsP-scores are based solely on the point estimates and standard errors of the frequentist network meta-analysis estimates under normality assumption and can easily be calculated as means of one-sided p-values. They measure the mean extent of certainty that a treatment is better than the competing treatments.ResultsUsing case studies of network meta-analysis in diabetes and depression, we demonstrate that the numerical values of SUCRA and P-Score are nearly identical.ConclusionsRanking treatments in frequentist network meta-analysis works without resampling. Like the SUCRA values, P-scores induce a ranking of all treatments that mostly follows that of the point estimates, but takes precision into account. However, neither SUCRA nor P-score offer a major advantage compared to looking at credible or confidence intervals.

1338 sitasi en Medicine
S2 Open Access 2006
Why phishing works

Rachna Dhamija, J. D. Tygar, Marti A. Hearst

To build systems shielding users from fraudulent (or phishing) websites, designers need to know which attack strategies work and why. This paper provides the first empirical evidence about which malicious strategies are successful at deceiving general users. We first analyzed a large set of captured phishing attacks and developed a set of hypotheses about why these strategies might work. We then assessed these hypotheses with a usability study in which 22 participants were shown 20 web sites and asked to determine which ones were fraudulent. We found that 23% of the participants did not look at browser-based cues such as the address bar, status bar and the security indicators, leading to incorrect choices 40% of the time. We also found that some visual deception attacks can fool even the most sophisticated users. These results illustrate that standard security indicators are not effective for a substantial fraction of users, and suggest that alternative approaches are needed.

1569 sitasi en Computer Science
S2 Open Access 2021
Dual Contrastive Learning for General Face Forgery Detection

Ke Sun, Taiping Yao, Shen Chen et al.

With various facial manipulation techniques arising, face forgery detection has drawn growing attention due to security concerns. Previous works always formulate face forgery detection as a classification problem based on cross-entropy loss, which emphasizes category-level differences rather than the essential discrepancies between real and fake faces, limiting model generalization in unseen domains. To address this issue, we propose a novel face forgery detection framework, named Dual Contrastive Learning (DCL), which specially constructs positive and negative paired data and performs designed contrastive learning at different granularities to learn generalized feature representation. Concretely, combined with the hard sample selection strategy, Inter-Instance Contrastive Learning (Inter-ICL) is first proposed to promote task-related discriminative features learning by especially constructing instance pairs. Moreover, to further explore the essential discrepancies, Intra-Instance Contrastive Learning (Intra-ICL) is introduced to focus on the local content inconsistencies prevalent in the forged faces by constructing local region pairs inside instances. Extensive experiments and visualizations on several datasets demonstrate the generalization of our method against the state-of-the-art competitors. Our Code is available at https://github.com/Tencent/TFace.git.

236 sitasi en Computer Science
S2 Open Access 2021
General Instance Distillation for Object Detection

Xing Dai, Jiang, Zhao Wu et al.

In recent years, knowledge distillation has been proved to be an effective solution for model compression. This approach can make lightweight student models acquire the knowledge extracted from cumbersome teacher models. However, previous distillation methods of detection have weak generalization for different detection frameworks and rely heavily on ground truth (GT), ignoring the valuable relation information between instances. Thus, we propose a novel distillation method for detection tasks based on discriminative instances without considering the positive or negative distinguished by GT, which is called general instance distillation (GID). Our approach contains a general instance selection module (GISM) to make full use of feature-based, relation-based and response-based knowledge for distillation. Extensive results demonstrate that the student model achieves significant AP improvement and even outperforms the teacher in various detection frame-works. Specifically, RetinaNet with ResNet-50 achieves 39.1% in mAP with GID on COCO dataset, which surpasses the baseline 36.2% by 2.9%, and even better than the ResNet-101 based teacher model with 38.1% AP.

232 sitasi en Computer Science
S2 Open Access 2022
General Cutting Planes for Bound-Propagation-Based Neural Network Verification

Huan Zhang, Shiqi Wang, Kaidi Xu et al.

Bound propagation methods, when combined with branch and bound, are among the most effective methods to formally verify properties of deep neural networks such as correctness, robustness, and safety. However, existing works cannot handle the general form of cutting plane constraints widely accepted in traditional solvers, which are crucial for strengthening verifiers with tightened convex relaxations. In this paper, we generalize the bound propagation procedure to allow the addition of arbitrary cutting plane constraints, including those involving relaxed integer variables that do not appear in existing bound propagation formulations. Our generalized bound propagation method, GCP-CROWN, opens up the opportunity to apply general cutting plane methods for neural network verification while benefiting from the efficiency and GPU acceleration of bound propagation methods. As a case study, we investigate the use of cutting planes generated by off-the-shelf mixed integer programming (MIP) solver. We find that MIP solvers can generate high-quality cutting planes for strengthening bound-propagation-based verifiers using our new formulation. Since the branching-focused bound propagation procedure and the cutting-plane-focused MIP solver can run in parallel utilizing different types of hardware (GPUs and CPUs), their combination can quickly explore a large number of branches with strong cutting planes, leading to strong verification performance. Experiments demonstrate that our method is the first verifier that can completely solve the oval20 benchmark and verify twice as many instances on the oval21 benchmark compared to the best tool in VNN-COMP 2021, and also noticeably outperforms state-of-the-art verifiers on a wide range of benchmarks. GCP-CROWN is part of the $\alpha,\!\beta$-CROWN verifier, the VNN-COMP 2022 winner. Code is available at http://PaperCode.cc/GCP-CROWN

133 sitasi en Computer Science, Mathematics
S2 Open Access 2022
Soft Diffusion: Score Matching for General Corruptions

G. Daras, M. Delbracio, Hossein Talebi et al.

We define a broader family of corruption processes that generalizes previously known diffusion models. To reverse these general diffusions, we propose a new objective called Soft Score Matching that provably learns the score function for any linear corruption process and yields state of the art results for CelebA. Soft Score Matching incorporates the degradation process in the network. Our new loss trains the model to predict a clean image, \textit{that after corruption}, matches the diffused observation. We show that our objective learns the gradient of the likelihood under suitable regularity conditions for a family of corruption processes. We further develop a principled way to select the corruption levels for general diffusion processes and a novel sampling method that we call Momentum Sampler. We show experimentally that our framework works for general linear corruption processes, such as Gaussian blur and masking. We achieve state-of-the-art FID score $1.85$ on CelebA-64, outperforming all previous linear diffusion models. We also show significant computational benefits compared to vanilla denoising diffusion.

126 sitasi en Computer Science
S2 Open Access 2020
Knowledge-Guided Multi-Label Few-Shot Learning for General Image Recognition

Tianshui Chen, Liang Lin, Riquan Chen et al.

Recognizing multiple labels of an image is a practical yet challenging task, and remarkable progress has been achieved by searching for semantic regions and exploiting label dependencies. However, current works utilize RNN/LSTM to implicitly capture sequential region/label dependencies, which cannot fully explore mutual interactions among the semantic regions/labels and do not explicitly integrate label co-occurrences. In addition, these works require large amounts of training samples for each category, and they are unable to generalize to novel categories with limited samples. To address these issues, we propose a knowledge-guided graph routing (KGGR) framework, which unifies prior knowledge of statistical label correlations with deep neural networks. The framework exploits prior knowledge to guide adaptive information propagation among different categories to facilitate multi-label analysis and reduce the dependency of training samples. Specifically, it first builds a structured knowledge graph to correlate different labels based on statistical label co-occurrence. Then, it introduces the label semantics to guide learning semantic-specific features to initialize the graph, and it exploits a graph propagation network to explore graph node interactions, enabling learning contextualized image feature representations. Moreover, we initialize each graph node with the classifier weights for the corresponding label and apply another propagation network to transfer node messages through the graph. In this way, it can facilitate exploiting the information of correlated labels to help train better classifiers, especially for labels with limited training samples. We conduct extensive experiments on the traditional multi-label image recognition (MLR) and multi-label few-shot learning (ML-FSL) tasks and show that our KGGR framework outperforms the current state-of-the-art methods by sizable margins on the public benchmarks.

192 sitasi en Computer Science, Medicine
S2 Open Access 2023
Learning Linear Causal Representations from Interventions under General Nonlinear Mixing

Simon Buchholz, Goutham Rajendran, Elan Rosenfeld et al.

We study the problem of learning causal representations from unknown, latent interventions in a general setting, where the latent distribution is Gaussian but the mixing function is completely general. We prove strong identifiability results given unknown single-node interventions, i.e., without having access to the intervention targets. This generalizes prior works which have focused on weaker classes, such as linear maps or paired counterfactual data. This is also the first instance of causal identifiability from non-paired interventions for deep neural network embeddings. Our proof relies on carefully uncovering the high-dimensional geometric structure present in the data distribution after a non-linear density transformation, which we capture by analyzing quadratic forms of precision matrices of the latent distributions. Finally, we propose a contrastive algorithm to identify the latent variables in practice and evaluate its performance on various tasks.

90 sitasi en Computer Science, Mathematics
S2 Open Access 2021
When Problem Solving Followed by Instruction Works: Evidence for Productive Failure

Tanmay Sinha, Manu Kapur

When learning a new concept, should students engage in problem solving followed by instruction (PS-I) or instruction followed by problem solving (I-PS)? Noting that there is a passionate debate about the design of initial learning, we report evidence from a meta-analysis of 53 studies with 166 comparisons that compared PS-I with I-PS design. Our results showed a significant, moderate effect in favor of PS-I (Hedge’s g 0.36 [95% confidence interval 0.20; 0.51]). The effects were even stronger (Hedge’s g ranging between 0.37 and 0.58) when PS-I was implemented with high fidelity to the principles of Productive Failure (PF), a subset variant of PS-I design. Students’ grade level, intervention time span, and its (quasi-)experimental nature contributed to the efficacy of PS-I over I-PS designs. Contrasting trends were, however, observed for younger age learners (second to fifth graders) and for the learning of domain-general skills, for which effect sizes favored I-PS. Overall, an estimation of true effect sizes after accounting for publication bias suggested a strong effect size favoring PS-I (Hedge’s g 0.87).

147 sitasi en
arXiv Open Access 2025
Safe Reinforcement Learning-based Automatic Generation Control

Amr S. Mohamed, Emily Nguyen, Deepa Kundur

Amidst the growing demand for implementing advanced control and decision-making algorithms|to enhance the reliability, resilience, and stability of power systems|arises a crucial concern regarding the safety of employing machine learning techniques. While these methods can be applied to derive more optimal control decisions, they often lack safety assurances. This paper proposes a framework based on control barrier functions to facilitate safe learning and deployment of reinforcement learning agents for power system control applications, specifically in the context of automatic generation control. We develop the safety barriers and reinforcement learning framework necessary to establish trust in reinforcement learning as a safe option for automatic generation control - as foundation for future detailed verification and application studies.

en eess.SY
S2 Open Access 2020
Variational Policy Gradient Method for Reinforcement Learning with General Utilities

Junyu Zhang, Alec Koppel, A. S. Bedi et al.

In recent years, reinforcement learning (RL) systems with general goals beyond a cumulative sum of rewards have gained traction, such as in constrained problems, exploration, and acting upon prior experiences. In this paper, we consider policy optimization in Markov Decision Problems, where the objective is a general concave utility function of the state-action occupancy measure, which subsumes several of the aforementioned examples as special cases. Such generality invalidates the Bellman equation. As this means that dynamic programming no longer works, we focus on direct policy search. Analogously to the Policy Gradient Theorem \cite{sutton2000policy} available for RL with cumulative rewards, we derive a new Variational Policy Gradient Theorem for RL with general utilities, which establishes that the parametrized policy gradient may be obtained as the solution of a stochastic saddle point problem involving the Fenchel dual of the utility function. We develop a variational Monte Carlo gradient estimation algorithm to compute the policy gradient based on sample paths. We prove that the variational policy gradient scheme converges globally to the optimal policy for the general objective, though the optimization problem is nonconvex. We also establish its rate of convergence of the order $O(1/t)$ by exploiting the hidden convexity of the problem, and proves that it converges exponentially when the problem admits hidden strong convexity. Our analysis applies to the standard RL problem with cumulative rewards as a special case, in which case our result improves the available convergence rate.

155 sitasi en Computer Science, Mathematics
S2 Open Access 2021
Multi-Task Self-Training for Learning General Representations

Golnaz Ghiasi, Barret Zoph, E. D. Cubuk et al.

Despite the fast progress in training specialized models for various tasks, learning a single general model that works well for many tasks is still challenging for computer vision. Here we introduce multi-task self-training (MuST), which harnesses the knowledge in independent specialized teacher models (e.g., ImageNet model on classification) to train a single general student model. Our approach has three steps. First, we train specialized teachers independently on labeled datasets. We then use the specialized teachers to label an unlabeled dataset to create a multi-task pseudo labeled dataset. Finally, the dataset, which now contains pseudo labels from teacher models trained on different datasets/tasks, is then used to train a student model with multi-task learning. We evaluate the feature representations of the student model on 6 vision tasks including image recognition (classification, detection, segmentation) and 3D geometry estimation (depth and surface normal estimation). MuST is scalable with unlabeled or partially labeled datasets and outperforms both specialized supervised models and self-supervised models when training on large scale datasets. Lastly, we show MuST can improve upon already strong checkpoints [23] trained with billions of examples. The results suggest self-training is a promising direction to aggregate labeled and unlabeled training data for learning general feature representations.

118 sitasi en Computer Science
S2 Open Access 2022
Mousika: Enable General In-Network Intelligence in Programmable Switches by Knowledge Distillation

Guorui Xie, Qing Li, Yutao Dong et al.

Given the power efficiency and Tbps throughput of packet processing, several works are proposed to offload the decision tree (DT) to programmable switches, i.e., in-network intelligence. Though the DT is suitable for the switches’ match-action paradigm, it has several limitations. E.g., its range match rules may not be supported well due to the hardware diversity; and its implementation also consumes lots of switch resources (e.g., stages and memory). Moreover, as learning algorithms (particularly deep learning) have shown their superior performance, some more complicated learning models are emerging for networking. However, their high computational complexity and large storage requirement are cause challenges in the deployment on switches. Therefore, we propose Mousika, an in-network intelligence framework that addresses these drawbacks successfully. First, we modify the DT to the Binary Decision Tree (BDT). Compared with the DT, our BDT supports faster training, generates fewer rules, and satisfies switch constraints better. Second, we introduce the teacher-student knowledge distillation in Mousika, which enables the general translation from other learning models to the BDT. Through the translation, we can not only utilize the super learning capabilities of complicated models, but also avoid the computation/memory constraints when deploying them on switches directly for line-speed processing.

81 sitasi en Computer Science

Halaman 2 dari 489504