Results for "General works"

Showing 20 of ~9792961 results · from DOAJ, arXiv, CrossRef, Semantic Scholar

S2 Open Access 2021
Real-ESRGAN: Training Real-World Blind Super-Resolution with Pure Synthetic Data

Xintao Wang, Liangbin Xie, Chao Dong et al.

Though many attempts have been made in blind super-resolution to restore low-resolution images with unknown and complex degradations, they are still far from addressing general real-world degraded images. In this work, we extend the powerful ESRGAN to a practical restoration application (namely, Real-ESRGAN), which is trained with pure synthetic data. Specifically, a high-order degradation modeling process is introduced to better simulate complex real-world degradations. We also consider the common ringing and overshoot artifacts in the synthesis process. In addition, we employ a U-Net discriminator with spectral normalization to increase discriminator capability and stabilize the training dynamics. Extensive comparisons have shown its visual performance to be superior to that of prior works on various real datasets. We also provide efficient implementations to synthesize training pairs on the fly.

1780 citations en Engineering, Computer Science
S2 Open Access 2021
MetaFormer is Actually What You Need for Vision

Weihao Yu, Romy Mi Luo, Pan Zhou et al.

Transformers have shown great potential in computer vision tasks. A common belief is their attention-based token mixer module contributes most to their competence. However, recent works show the attention-based module in transformers can be replaced by spatial MLPs and the resulting models still perform quite well. Based on this observation, we hypothesize that the general architecture of the transformers, instead of the specific token mixer module, is more essential to the model's performance. To verify this, we deliberately replace the attention module in transformers with an embarrassingly simple spatial pooling operator to conduct only basic token mixing. Surprisingly, we observe that the derived model, termed PoolFormer, achieves competitive performance on multiple computer vision tasks. For example, on ImageNet-1K, PoolFormer achieves 82.1% top-1 accuracy, surpassing well-tuned vision transformer/MLP-like baselines DeiT-B/ResMLP-B24 by 0.3%/1.1% accuracy with 35%/52% fewer parameters and 49%/61% fewer MACs. The effectiveness of PoolFormer verifies our hypothesis and urges us to initiate the concept of “MetaFormer”, a general architecture abstracted from transformers without specifying the token mixer. Based on the extensive experiments, we argue that MetaFormer is the key player in achieving superior results for recent transformer and MLP-like models on vision tasks. This work calls for more future research dedicated to improving MetaFormer instead of focusing on the token mixer modules. Additionally, our proposed PoolFormer could serve as a starting baseline for future MetaFormer architecture design.

1296 citations en Computer Science
S2 Open Access 2019
Devign: Effective Vulnerability Identification by Learning Comprehensive Program Semantics via Graph Neural Networks

Yaqin Zhou, Shangqing Liu, J. Siow et al.

Vulnerability identification is crucial to protect software systems from cyber attacks. It is especially important to localize the vulnerable functions in the source code to facilitate the fix. However, this is a challenging and tedious process that also requires specialized security expertise. Inspired by the work on manually defined patterns of vulnerabilities from various code representation graphs and the recent advances in graph neural networks, we propose Devign, a general graph neural network based model for graph-level classification through learning on a rich set of code semantic representations. It includes a novel Conv module to efficiently extract useful features in the learned rich node representations for graph-level classification. The model is trained over manually labeled datasets built on 4 diversified large-scale open-source C projects that incorporate the high complexity and variety of real source code instead of the synthetic code used in previous works. The results of the extensive evaluation on the datasets demonstrate that Devign significantly outperforms the state of the art, with an average of 10.51% higher accuracy and 8.68% higher F1 score, and that the Conv module contributes an average increase of 4.66% in accuracy and 6.37% in F1.

1058 citations en Computer Science, Mathematics
S2 Open Access 2019
PointPainting: Sequential Fusion for 3D Object Detection

Sourabh Vora, Alex H. Lang, Bassam Helou et al.

Camera and lidar are important sensor modalities for robotics in general and self-driving cars in particular. The sensors provide complementary information offering an opportunity for tight sensor-fusion. Surprisingly, lidar-only methods outperform fusion methods on the main benchmark datasets, suggesting a gap in the literature. In this work, we propose PointPainting: a sequential fusion method to fill this gap. PointPainting works by projecting lidar points into the output of an image-only semantic segmentation network and appending the class scores to each point. The appended (painted) point cloud can then be fed to any lidar-only method. Experiments show large improvements on three different state-of-the-art methods, PointRCNN, VoxelNet and PointPillars, on the KITTI and nuScenes datasets. The painted version of PointRCNN represents a new state of the art on the KITTI leaderboard for the bird's-eye view detection task. In ablation, we study how the effects of Painting depend on the quality and format of the semantic segmentation output, and demonstrate how latency can be minimized through pipelining.

1022 citations en Computer Science, Engineering
S2 Open Access 2019
MelGAN: Generative Adversarial Networks for Conditional Waveform Synthesis

Kundan Kumar, Rithesh Kumar, T. Boissiere et al.

Previous works (Donahue et al., 2018a; Engel et al., 2019a) have found that generating coherent raw audio waveforms with GANs is challenging. In this paper, we show that it is possible to train GANs reliably to generate high quality coherent waveforms by introducing a set of architectural changes and simple training techniques. A subjective evaluation metric (Mean Opinion Score, or MOS) shows the effectiveness of the proposed approach for high quality mel-spectrogram inversion. To establish the generality of the proposed techniques, we show qualitative results of our model in speech synthesis, music domain translation and unconditional music synthesis. We evaluate the various components of the model through ablation studies and suggest a set of guidelines to design general purpose discriminators and generators for conditional sequence synthesis tasks. Our model is non-autoregressive, fully convolutional, with significantly fewer parameters than competing models and generalizes to unseen speakers for mel-spectrogram inversion. Our PyTorch implementation runs more than 100x faster than real time on a GTX 1080Ti GPU and more than 2x faster than real time on CPU, without any hardware-specific optimization tricks.

1108 citations en Computer Science, Engineering
S2 Open Access 2017
An Overview of Multi-Task Learning in Deep Neural Networks

Sebastian Ruder

Multi-task learning (MTL) has led to successes in many applications of machine learning, from natural language processing and speech recognition to computer vision and drug discovery. This article aims to give a general overview of MTL, particularly in deep neural networks. It introduces the two most common methods for MTL in Deep Learning, gives an overview of the literature, and discusses recent advances. In particular, it seeks to help ML practitioners apply MTL by shedding light on how MTL works and providing guidelines for choosing appropriate auxiliary tasks.

3166 citations en Computer Science, Mathematics
S2 Open Access 2017
Interpretable Explanations of Black Boxes by Meaningful Perturbation

Ruth C. Fong, A. Vedaldi

As machine learning algorithms are increasingly applied to high impact yet high risk tasks, such as medical diagnosis or autonomous driving, it is critical that researchers can explain how such algorithms arrived at their predictions. In recent years, a number of image saliency methods have been developed to summarize where highly complex neural networks “look” in an image for evidence for their predictions. However, these techniques are limited by their heuristic nature and architectural constraints. In this paper, we make two main contributions: First, we propose a general framework for learning different kinds of explanations for any black box algorithm. Second, we specialise the framework to find the part of an image most responsible for a classifier decision. Unlike previous works, our method is model-agnostic and testable because it is grounded in explicit and interpretable image perturbations.

1666 citations en Computer Science, Mathematics
DOAJ Open Access 2025
A História mostra-nos que não temos de escolher o pior (History Shows Us That We Do Not Have to Choose the Worst)

José Bragança de Miranda, Carlos Camponez, José Gomes Pinto

Holding a doctorate in Communication Sciences and the agregação in Theory of Culture from the Universidade Nova de Lisboa, José Bragança de Miranda, in an interview with Biblos, reflects on the theme of freedom, bringing together Communication, Philosophy, History, and Politics. Resistant to monist logics of thought and of organizations, the current rector of the Universidade Lusófona considers that, if it existed at all, the end of History opened with the Revolutions that led to the control of absolute power. Acknowledging that the control of this power is not a definitive achievement, José Bragança de Miranda nonetheless argues that History shows that choosing the worst is not an inevitability for Humanity. For that reason, rather than debating concepts such as Freedom, he considers it important to make use of them, renewing in each tiny present the legacy of many heroes of the past.

History of scholarship and learning. The humanities
DOAJ Open Access 2025
The Impact of Entrepreneurial Competence on Entrepreneurial Performance of Family Farms: A Comprehensive Research Framework of “Competence – Legitimacy – Performance”

Xiaofeng Su, Xiaoli Jiang, Anxin Xu

In China, the implementation of the rural revitalization strategy provides a broad stage for migrant workers to return home to start their own businesses. This study constructs a research framework of “competence – legitimacy – performance.” Through online and offline surveys, this study obtained 477 valid samples from new family farm entrepreneurs in Fujian province, China. Using a structural equation model, this study explores the relationship between entrepreneurial competence and family farm entrepreneurial performance. The empirical results show that all five dimensions of family farm entrepreneurs’ entrepreneurial competence, namely opportunity recognition competence, network competence, resource acquisition competence, entrepreneurial learning competence, and improvisational competence, have positive impacts on family farm entrepreneurial performance. Organizational legitimacy also has a positive impact on family farm entrepreneurial performance. In addition, the data support a mediating effect of organizational legitimacy in the relationships between family farm entrepreneurial performance and opportunity recognition competence, network competence, resource acquisition competence, and entrepreneurial learning competence. However, organizational legitimacy does not play a significant mediating role in the relationship between improvisational competence and family farm entrepreneurial performance. The research findings offer insights for family farm entrepreneurs and policy-makers.

History of scholarship and learning. The humanities, Social Sciences
