Results for "cs.CL"

Showing 20 of ~154638 results · from DOAJ, arXiv, CrossRef

arXiv Open Access 2025
ClonEval: An Open Voice Cloning Benchmark

Iwona Christop, Tomasz Kuczyński, Marek Kubis

We present a novel benchmark for voice cloning text-to-speech models. The benchmark consists of an evaluation protocol, an open-source library for assessing the performance of voice cloning models, and an accompanying leaderboard. The paper discusses design considerations and presents a detailed description of the evaluation procedure. The usage of the software library is explained, along with the organization of results on the leaderboard.

en cs.CL
CrossRef Open Access 2023
First-principles predictions of enhanced thermoelectric properties for Cs<sub>2</sub>SnI<sub>2</sub>Cl<sub>2</sub> and Cs<sub>2</sub>PbI<sub>2</sub>Cl<sub>2</sub> monolayers with spin–orbit coupling

Jiajia Fei, Xiaojiao Zhang, Jialin Li et al.

Inspired by the exceptional charge transport properties and ultra-low thermal conductivity of halide perovskites, we investigate the electronic nature, thermal transport, and thermoelectric properties of the Ruddlesden–Popper all-inorganic perovskite Cs2SnI2Cl2 and Cs2PbI2Cl2 monolayers using first-principles calculations. During the calculations, spin–orbit coupling has been considered for electronic transport as well as thermoelectric properties. The results show that the Cs2SnI2Cl2 and Cs2PbI2Cl2 monolayers exhibit high carrier mobility and low thermal conductivity. Stronger phonon–phonon interaction is responsible for the fact that the thermal conductivity of the Cs2SnI2Cl2 monolayer is much lower than that of the Cs2PbI2Cl2 monolayer. At 700 K, the values of the figure of merit (ZT) for the n-type doped Cs2SnI2Cl2 and Cs2PbI2Cl2 monolayers are about 1.05 and 0.32 at the optimized carrier concentrations of 5.42 × 10¹² cm⁻² and 9.84 × 10¹² cm⁻². Moreover, when spin–orbit coupling is considered, the corresponding ZT values are enhanced to 2.73 and 1.98 at 5.27 × 10¹¹ cm⁻² and 6.16 × 10¹¹ cm⁻². These results signify that Cs2SnI2Cl2 and Cs2PbI2Cl2 monolayers are promising thermoelectric candidates.
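For reference, the figure of merit ZT quoted in this abstract is the standard dimensionless thermoelectric figure of merit (the general definition, not anything specific to this paper):

```latex
ZT = \frac{S^{2}\,\sigma\,T}{\kappa_{e} + \kappa_{l}}
```

where S is the Seebeck coefficient, σ the electrical conductivity, T the absolute temperature, and κ_e and κ_l the electronic and lattice contributions to the thermal conductivity.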

4 citations en
arXiv Open Access 2023
SIGMORPHON 2023 Shared Task of Interlinear Glossing: Baseline Model

Michael Ginn

Language documentation is a critical aspect of language preservation, often including the creation of Interlinear Glossed Text (IGT). Creating IGT is time-consuming and tedious, and automating the process can save valuable annotator effort. This paper describes the baseline system for the SIGMORPHON 2023 Shared Task of Interlinear Glossing. In our system, we utilize a transformer architecture and treat gloss generation as a sequence labelling task.
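The shared-task transformer itself cannot be reproduced in a few lines, but the task framing the abstract describes — gloss generation as sequence labelling, one gloss label per input morpheme — can be sketched with a trivial dictionary baseline. All data and names here are illustrative, not from the shared task:

```python
from collections import Counter, defaultdict

def train_gloss_baseline(pairs):
    """Count how often each morpheme receives each gloss label,
    then keep the most frequent gloss per morpheme."""
    counts = defaultdict(Counter)
    for morphemes, glosses in pairs:
        for m, g in zip(morphemes, glosses):
            counts[m][g] += 1
    return {m: c.most_common(1)[0][0] for m, c in counts.items()}

def gloss(model, morphemes, unk="???"):
    """Label each morpheme independently, like a per-token tagger."""
    return [model.get(m, unk) for m in morphemes]

# Toy interlinear data: (segmented word, gloss line)
train = [
    (["ni", "k", "neki"], ["1SG", "PRS", "want"]),
    (["ni", "k", "itta"], ["1SG", "PRS", "see"]),
]
model = train_gloss_baseline(train)
print(gloss(model, ["ni", "k", "neki"]))  # ['1SG', 'PRS', 'want']
```

A neural tagger replaces the dictionary lookup with a contextual per-token classifier, but the input/output contract is the same.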

en cs.CL
arXiv Open Access 2022
A dynamic programming algorithm for span-based nested named-entity recognition in O(n^2)

Caio Corro

Span-based nested named-entity recognition (NER) has a cubic-time complexity using a variant of the CYK algorithm. We show that by adding a supplementary structural constraint on the search space, nested NER has a quadratic-time complexity, i.e. the same asymptotic complexity as the non-nested case. The proposed algorithm covers a large part of three standard English benchmarks and delivers comparable experimental results.
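The paper's constrained quadratic algorithm is beyond a snippet, but the cubic baseline the abstract refers to — a CYK-style chart that scores nested bracketings over spans — can be sketched. The span scorer here is a hypothetical stand-in for a learned model:

```python
def best_span_tree(n, span_score):
    """CYK-style chart: best[i][j] = best score of a binary
    bracketing over tokens i..j-1. The split loop makes this
    O(n^3); the paper's structural constraint removes it,
    yielding O(n^2)."""
    best = [[0.0] * (n + 1) for _ in range(n + 1)]
    for length in range(1, n + 1):
        for i in range(0, n - length + 1):
            j = i + length
            split = 0.0
            if length > 1:
                split = max(best[i][k] + best[k][j] for k in range(i + 1, j))
            best[i][j] = span_score(i, j) + split
    return best[0][n]

# Toy scorer: only the spans (0,2) and (2,3) carry entity evidence
scores = {(0, 2): 2.0, (2, 3): 1.0}
print(best_span_tree(3, lambda i, j: scores.get((i, j), 0.0)))  # 3.0
```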

en cs.CL
arXiv Open Access 2021
Fixing exposure bias with imitation learning needs powerful oracles

Luca Hormann, Artem Sokolov

We apply imitation learning (IL) to tackle the NMT exposure-bias problem with error-correcting oracles. We evaluate an SMT lattice-based oracle which, despite its excellent performance on an unconstrained oracle-translation task, turned out to be too pruned and idiosyncratic to serve as the oracle for IL.

en cs.CL, cs.LG
arXiv Open Access 2020
A Survey of Neural Networks and Formal Languages

Joshua Ackerman, George Cybenko

This report is a survey of the relationships between various state-of-the-art neural network architectures and formal languages as, for example, structured by the Chomsky Language Hierarchy. Of particular interest are the abilities of a neural architecture to represent, recognize and generate words from a specific language by learning from samples of the language.

en cs.CL
arXiv Open Access 2019
Fine-tune BERT for Extractive Summarization

Yang Liu

BERT, a pre-trained Transformer model, has achieved ground-breaking performance on multiple NLP tasks. In this paper, we describe BERTSUM, a simple variant of BERT, for extractive summarization. Our system is the state of the art on the CNN/Dailymail dataset, outperforming the previous best-performing system by 1.65 on ROUGE-L. The code to reproduce our results is available at https://github.com/nlpyang/BertSum.
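BERTSUM itself requires pretrained weights, but the extractive framing it builds on — score every sentence, keep the top-k in document order — can be sketched with a simple frequency-based scorer. This is a generic baseline, not the paper's model:

```python
from collections import Counter

def extract_summary(sentences, k=1):
    """Score each sentence by the average corpus frequency of its
    words, then return the top-k sentences in document order."""
    tokenized = [s.lower().split() for s in sentences]
    freq = Counter(w for toks in tokenized for w in toks)

    def score(toks):
        return sum(freq[w] for w in toks) / max(len(toks), 1)

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(tokenized[i]), reverse=True)
    keep = sorted(ranked[:k])          # restore document order
    return [sentences[i] for i in keep]

doc = ["the model beats baselines",
       "the model uses the transformer",
       "lunch was nice"]
print(extract_summary(doc, k=1))  # ['the model uses the transformer']
```

BERTSUM replaces the frequency score with a classifier head over BERT's sentence representations; the selection step is unchanged.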

en cs.CL
arXiv Open Access 2018
The Evolution of Popularity and Images of Characters in Marvel Cinematic Universe Fanfictions

Fan Bu

This analysis proposes a new topic model to study the yearly trends in Marvel Cinematic Universe fanfictions on three levels: character popularity, character images/topics, and vocabulary pattern of topics. It is found that character appearances in fanfictions have become more diverse over the years thanks to constant introduction of new characters in feature films, and in the case of Captain America, multi-dimensional character development is well-received by the fanfiction world.

en cs.CL
arXiv Open Access 2018
Extrapolation in NLP

Jeff Mitchell, Pasquale Minervini, Pontus Stenetorp et al.

We argue that extrapolation to examples outside the training space will often be easier for models that capture global structures, rather than just maximise their local fit to the training data. We show that this is true for two popular models: the Decomposable Attention Model and word2vec.

en cs.CL
arXiv Open Access 2018
A case for deep learning in semantics

Christopher Potts

Pater's target article builds a persuasive case for establishing stronger ties between theoretical linguistics and connectionism (deep learning). This commentary extends his arguments to semantics, focusing in particular on issues of learning, compositionality, and lexical meaning.

en cs.CL
CrossRef Open Access 2017
[DMAPTs]<sup>+</sup>Cl<sup>−</sup>: A Promising Versatile Regioselective Tosyl Transfer Reagent

Chiranjeevi Donthulachitti, Sridhar Reddy Kothakapu, Ravi Kumar Shekunti et al.

Regioselective monotosylation, for the first time on 2,4-syn and 2,4-anti configured ene-tetrols and on diols and triols, was achieved in excellent yields and selectivity by employing [DMAPTs]+Cl−, 1. We also proposed and developed operationally flexible, preferential regio-complementary tosylation of the cis-1,2-diol moiety of pyranosides by employing 1. Further, NMR correlations for unambiguous structure determination of regioselectivity, and also the preferred conformations of these pyranosides, were established.

17 citations en
arXiv Open Access 2017
Convolutional Attention-based Seq2Seq Neural Network for End-to-End ASR

Dan Lim

This thesis introduces a sequence-to-sequence model with Luong's attention mechanism for end-to-end ASR. It also describes several neural network techniques, including batch normalization, dropout, and residual networks, which constitute the convolutional attention-based seq2seq neural network. Finally, the proposed model proved its effectiveness for speech recognition, achieving a 15.8% phoneme error rate on the TIMIT dataset.
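The attention mechanism named in this abstract — Luong's global attention with the "dot" score — is compact enough to sketch in pure Python: score each encoder state against the current decoder state, softmax the scores, and take the weighted sum as the context vector. Vectors here are plain lists and the values are illustrative:

```python
import math

def luong_dot_attention(decoder_state, encoder_states):
    """Luong global attention, 'dot' score:
    a_t(s) = softmax(h_t . h_s); context = sum_s a_t(s) * h_s."""
    scores = [sum(d * e for d, e in zip(decoder_state, h))
              for h in encoder_states]
    m = max(scores)                                  # numerically
    exp = [math.exp(s - m) for s in scores]          # stable softmax
    z = sum(exp)
    weights = [e / z for e in exp]
    dim = len(decoder_state)
    context = [sum(w * h[i] for w, h in zip(weights, encoder_states))
               for i in range(dim)]
    return weights, context

h_t = [1.0, 0.0]
enc = [[1.0, 0.0], [0.0, 1.0]]
weights, context = luong_dot_attention(h_t, enc)
print(weights)  # the encoder state aligned with h_t gets the higher weight
```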

en cs.CL
arXiv Open Access 2017
Build Fast and Accurate Lemmatization for Arabic

Hamdy Mubarak

In this paper we describe the complexity of building a lemmatizer for Arabic, which has a rich and complex derivational morphology, and we discuss the need for fast and accurate lemmatization to enhance Arabic Information Retrieval (IR) results. We also introduce a new data set that can be used to test lemmatization accuracy, and an efficient lemmatization algorithm that outperforms state-of-the-art Arabic lemmatization in terms of accuracy and speed. We make the data set and the code publicly available.

en cs.CL
arXiv Open Access 2017
Job Detection in Twitter

Besat Kassaie

In this report, we propose a new application for Twitter data called job detection: identifying people's job category based on their tweets. As preliminary work, we limited the task to distinguishing IT workers from other job holders. We used and compared both a simple bag-of-words model and a document representation based on the Skip-gram model. Our results show that the Skip-gram-based model achieves 76% precision and 82% recall.
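Of the two representations this abstract compares, the skip-gram side requires training embeddings, but the bag-of-words side is just token counts over a fixed vocabulary. A minimal sketch, with an illustrative vocabulary:

```python
from collections import Counter

def bag_of_words(tweet, vocab):
    """Represent a tweet as a vector of token counts over vocab;
    out-of-vocabulary tokens are simply dropped."""
    counts = Counter(tweet.lower().split())
    return [counts[w] for w in vocab]

vocab = ["python", "deploy", "server", "coffee"]
print(bag_of_words("Deploy the server then deploy again", vocab))
# [0, 2, 1, 0]
```

These count vectors feed directly into any off-the-shelf classifier; the skip-gram variant replaces them with averaged word embeddings.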

en cs.CL
arXiv Open Access 2016
Synthetic Language Generation and Model Validation in BEAST2

Stuart Bradley

Generating synthetic languages aids in the testing and validation of future computational linguistic models and methods. This thesis extends the BEAST2 phylogenetic framework to add linguistic sequence generation under multiple models. The new plugin is then used to test the effect of the phenomenon of word borrowing on the inference process under two widely used phylolinguistic models.

en cs.CL
arXiv Open Access 2011
Why is language well-designed for communication? (Commentary on Christiansen and Chater: 'Language as shaped by the brain')

Jean-Louis Dessalles

Selection through iterated learning explains no better than other non-functional accounts, such as universal grammar, why language is so well-designed for communicative efficiency. It does not predict several distinctive features of language, such as center embedding, large lexicons, or the lack of iconicity, that seem to serve communication purposes at the expense of learnability.

en cs.CL, q-bio.NC
CrossRef Open Access 2009
ChemInform Abstract: A Complex Cesiumchloride: Cs<sub>5</sub>[AgCl<sub>2</sub>][CoCl<sub>4</sub>]Cl<sub>2</sub>.

Oliver Fastje, Angela Moeller

ChemInform is a weekly Abstracting Service, delivering concise at-a-glance information extracted from about 200 leading journals. To access the ChemInform Abstract of an article which was published elsewhere, please select a "Full Text" option. The original article is trackable via the "References" option.

Page 7 of 7732