Results for "Manners and customs (General)"

arXiv Open Access 2025

Descriminative-Generative Custom Tokens for Vision-Language Models

Pramuditha Perera, Matthew Trager, Luca Zancato et al.

This paper explores the possibility of learning custom tokens for representing new concepts in Vision-Language Models (VLMs). Our aim is to learn tokens that can be effective for both discriminative and generative tasks while composing well with words to form new input queries. The targeted concept is specified in terms of a small set of images and a parent concept described using text. We operate on CLIP text features and propose to use a combination of a textual inversion loss and a classification loss to ensure that text features of the learned token are aligned with image features of the concept in the CLIP embedding space. We restrict the learned token to a low-dimensional subspace spanned by tokens for attributes that are appropriate for the given super-class. These modifications improve the quality of compositions of the learned token with natural language for generating new scenes. Further, we show that learned custom tokens can be used to form queries for the text-to-image retrieval task, and that they have the important benefit that composite queries can be visualized to ensure that the desired concept is faithfully encoded. Based on this, we introduce the method of Generation Aided Image Retrieval, where the query is modified at inference time to better suit the search intent. On the DeepFashion2 dataset, our method improves Mean Reciprocal Retrieval (MRR) over relevant baselines by 7%.

en cs.CV


arXiv Open Access 2025

TransLight: Image-Guided Customized Lighting Control with Generative Decoupling

Zongming Li, Lianghui Zhu, Haocheng Shen et al.

Most existing illumination-editing approaches fail to simultaneously provide customized control of light effects and preserve content integrity. This makes them less effective for practical lighting stylization requirements, especially in the challenging task of transferring complex light effects from a reference image to a user-specified target image. To address this problem, we propose TransLight, a novel framework that enables high-fidelity and high-freedom transfer of light effects. Extracting the light effect from the reference image is the most critical and challenging step in our method. The difficulty lies in the complex geometric structure features embedded in light effects that are highly coupled with content in real-world scenarios. To achieve this, we first present Generative Decoupling, where two fine-tuned diffusion models are used to accurately separate image content and light effects, generating a newly curated, million-scale dataset of image-content-light triplets. Then, we employ IC-Light as the generative model and train our model with our triplets, injecting the reference lighting image as an additional conditioning signal. The resulting TransLight model enables customized and natural transfer of diverse light effects. Notably, by thoroughly disentangling light effects from reference images, our generative decoupling strategy endows TransLight with highly flexible illumination control. Experimental results establish TransLight as the first method to successfully transfer light effects across disparate images, delivering more customized illumination control than existing techniques and charting new directions for research in illumination harmonization and editing.

en cs.CV, cs.AI


arXiv Open Access 2024

Customizing Large Language Model Generation Style using Parameter-Efficient Finetuning

Xinyue Liu, Harshita Diddee, Daphne Ippolito

One-size-fits-all large language models (LLMs) are increasingly being used to help people with their writing. However, the style these models are trained to write in may not suit all users or use cases. LLMs would be more useful as writing assistants if their idiolect could be customized to match each user. In this paper, we explore whether parameter-efficient finetuning (PEFT) with Low-Rank Adaptation can effectively guide the style of LLM generations. We use this method to customize LLaMA-2 to ten different authors and show that the generated text has lexical, syntactic, and surface alignment with the target author but struggles with content memorization. Our findings highlight the potential of PEFT to support efficient, user-level customization of LLMs.

en cs.CL


arXiv Open Access 2024

TV-3DG: Mastering Text-to-3D Customized Generation with Visual Prompt

Jiahui Yang, Donglin Di, Baorui Ma et al.

In recent years, advancements in generative models have significantly expanded the capabilities of text-to-3D generation. Many approaches rely on Score Distillation Sampling (SDS) technology. However, SDS struggles to accommodate multi-condition inputs, such as text and visual prompts, in customized generation tasks. To explore the core reasons, we decompose SDS into a difference term and a classifier-free guidance term. Our analysis identifies the core issue as arising from the difference term and the random noise addition during the optimization process, both contributing to deviations from the target mode during distillation. To address this, we propose a novel algorithm, Classifier Score Matching (CSM), which removes the difference term in SDS and uses a deterministic noise addition process to reduce noise during optimization, effectively overcoming the low-quality limitations of SDS in our customized generation framework. Based on CSM, we integrate visual prompt information with an attention fusion mechanism and sampling guidance techniques, forming the Visual Prompt CSM (VPCSM) algorithm. Furthermore, we introduce a Semantic-Geometry Calibration (SGC) module to enhance quality through improved textual information integration. We present our approach as TV-3DG, with extensive experiments demonstrating its capability to achieve stable, high-quality, customized 3D generation. Project page: https://yjhboy.github.io/TV-3DG

en cs.CV


arXiv Open Access 2024

CustAny: Customizing Anything from A Single Example

Lingjie Kong, Kai Wu, Xiaobin Hu et al.

Recent advances in diffusion-based text-to-image models have simplified creating high-fidelity images, but preserving the identity (ID) of specific elements, like a personal dog, is still challenging. Object customization, using reference images and textual descriptions, is key to addressing this issue. Current object customization methods are either object-specific, requiring extensive fine-tuning, or object-agnostic, offering zero-shot customization but limited to specialized domains. The primary issue of promoting zero-shot object customization from specific domains to the general domain is to establish a large-scale general ID dataset for model pre-training, which is time-consuming and labor-intensive. In this paper, we propose a novel pipeline to construct a large dataset of general objects and build the Multi-Category ID-Consistent (MC-IDC) dataset, featuring 315k text-image samples across 10k categories. With the help of MC-IDC, we introduce Customizing Anything (CustAny), a zero-shot framework that maintains ID fidelity and supports flexible text editing for general objects. CustAny features three key components: a general ID extraction module, a dual-level ID injection module, and an ID-aware decoupling module, allowing it to customize any object from a single reference image and text prompt. Experiments demonstrate that CustAny outperforms existing methods in both general object customization and specialized domains like human customization and virtual try-on. Our contributions include a large-scale dataset, the CustAny framework and novel ID processing to advance this field. Code and dataset will be released soon in https://github.com/LingjieKong-fdu/CustAny.

en cs.CV


arXiv Open Access 2024

DisenStudio: Customized Multi-subject Text-to-Video Generation with Disentangled Spatial Control

Hong Chen, Xin Wang, Yipeng Zhang et al.

Generating customized content in videos has received increasing attention recently. However, existing works primarily focus on customized text-to-video generation for a single subject, suffering from subject-missing and attribute-binding problems when the video is expected to contain multiple subjects. Furthermore, existing models struggle to assign the desired actions to the corresponding subjects (the action-binding problem), failing to achieve satisfactory multi-subject generation performance. To tackle these problems, we propose DisenStudio, a novel framework that can generate text-guided videos for customized multiple subjects, given a few images for each subject. Specifically, DisenStudio enhances a pretrained diffusion-based text-to-video model with our proposed spatial-disentangled cross-attention mechanism to associate each subject with the desired action. Then the model is customized for the multiple subjects with the proposed motion-preserved disentangled finetuning, which involves three tuning strategies: multi-subject co-occurrence tuning, masked single-subject tuning, and multi-subject motion-preserved tuning. The first two strategies guarantee the subject occurrence and preserve their visual attributes, and the third strategy helps the model maintain the temporal motion-generation ability when finetuning on static images. We conduct extensive experiments to demonstrate that DisenStudio significantly outperforms existing methods on various metrics. Additionally, we show that DisenStudio can be used as a powerful tool for various controllable generation applications.

en cs.CV


DOAJ Open Access 2023

A visibilidade da língua portuguesa no cenário da política linguística de Timor-Leste

Thiago Soares de Oliveira, Leiliane Rezende da Silva Silveira

Timor-Leste is a country of wide linguistic diversity, with two official languages (Portuguese and Tetum) and two working languages (English and Bahasa Indonesia), in addition to a variety of national languages. On that basis, this paper reflects on the current Timorese linguistic landscape and on the linguistic and identity-related visibility of Portuguese, considering the languages used in the country. To that end, it draws on bibliographic research to support the arguments raised, such as the presence in the territory of different official and working languages that nonetheless do not inhibit the presence of Portuguese in Timor-Leste. The theoretical references include works by Albuquerque (2011), Brito (2013), Henriques (2021), and Paulino (2022), among others. The results show that Portuguese has been gaining linguistic visibility not only within Timorese territory and the Community of Portuguese Language Countries (CPLP) but worldwide, which in theory favors the recognition of the Timorese nation, even if slowly. In addition, English and Indonesian stand out in Timor owing to the large number of foreigners from countries where those languages are spoken.

Literature (General), Manners and customs (General)


DOAJ Open Access 2023

Guimarães Rosa ou o homem intuitivo nietzschiano

Laysa Beretta

Considering the immense capacity for linguistic creation observed in Rosa's works and the studies published since the 1950s on the formal aspects of the text, in this study I examine some of the neologisms coined by Guimarães Rosa (1956) in Grande sertão: veredas in light of Nietzsche's essay "On Truth and Lies in a Nonmoral Sense" (1873). I thereby sought to observe the construction and, above all, the function of the linguistic universe erected in the novel. That is: what is the word worth in the work of Guimarães Rosa? What is the value of truth as a concept? Does Rosa's word confine a value? And if it does, what do the neologisms seek to confine?

Literature (General), Manners and customs (General)


arXiv Open Access 2023

Custom-Edit: Text-Guided Image Editing with Customized Diffusion Models

Jooyoung Choi, Yunjey Choi, Yunji Kim et al.

Text-to-image diffusion models can generate diverse, high-fidelity images based on user-provided text prompts. Recent research has extended these models to support text-guided image editing. While text guidance is an intuitive editing interface for users, it often fails to ensure the precise concept conveyed by users. To address this issue, we propose Custom-Edit, in which we (i) customize a diffusion model with a few reference images and then (ii) perform text-guided editing. Our key discovery is that customizing only language-relevant parameters with augmented prompts improves reference similarity significantly while maintaining source similarity. Moreover, we provide our recipe for each customization and editing process. We compare popular customization methods and validate our findings on two editing methods using various datasets.

en cs.CV


arXiv Open Access 2023

Decoupled Textual Embeddings for Customized Image Generation

Yufei Cai, Yuxiang Wei, Zhilong Ji et al.

Customized text-to-image generation, which aims to learn user-specified concepts with a few images, has drawn significant attention recently. However, existing methods usually suffer from overfitting issues and entangle the subject-unrelated information (e.g., background and pose) with the learned concept, limiting the potential to compose the concept into new scenes. To address these issues, we propose DETEX, a novel approach that learns the disentangled concept embedding for flexible customized text-to-image generation. Unlike conventional methods that learn a single concept embedding from the given images, our DETEX represents each image using multiple word embeddings during training, i.e., a learnable image-shared subject embedding and several image-specific subject-unrelated embeddings. To decouple irrelevant attributes (i.e., background and pose) from the subject embedding, we further present several attribute mappers that encode each image as several image-specific subject-unrelated embeddings. To encourage these unrelated embeddings to capture the irrelevant information, we incorporate them with corresponding attribute words and propose a joint training strategy to facilitate the disentanglement. During inference, we only use the subject embedding for image generation, while selectively using image-specific embeddings to retain image-specified attributes. Extensive experiments demonstrate that the subject embedding obtained by our method can faithfully represent the target concept, while showing superior editability compared to the state-of-the-art methods. Our code will be made publicly available.

en cs.CV


arXiv Open Access 2023

Improving Customer Experience in Call Centers with Intelligent Customer-Agent Pairing

S. Filippou, A. Tsiartas, P. Hadjineophytou et al.

Customer experience plays a critical role for a profitable organisation or company. A satisfied customer for a company corresponds to higher rates of customer retention, and better representation in the market. One way to improve customer experience is to optimize the functionality of its call center. In this work, we have collaborated with the largest provider of telecommunications and Internet access in the country, and we formulate the customer-agent pairing problem as a machine learning problem. The proposed learning-based method causes a significant improvement in performance of about 215% compared to a rule-based method.

en cs.LG


DOAJ Open Access 2022

Finisterra e os modos de povoar (ou perspectivar) uma paisagem

Gisele Seeger

In this article, we analyze the complex discursive arrangement of the category of narrative perspective in Finisterra, by Carlos de Oliveira. Drawing on premises from contemporary narrative studies, we argue that the configuration of all the other structuring elements of the narrative emanates from this arrangement. Through it, moreover, the reader becomes a participant in the play of the text, assuming a role analogous to that of the characters, the narrator, and the author himself as recreated by the text.

Literature (General), Manners and customs (General)


DOAJ Open Access 2022

Refúgio, exílio e hospitalidade em Agora vai ser assim, de Leonardo Tonus e Teoria da fronteira, de José Tolentino Mendonça

Keli Cristina Pacheco

The refusal to accept with resignation the current (in)human condition seems to generate the poetic act in part of the work of Leonardo Tonus and José Tolentino Mendonça. The question of refuge and the human ethical gesture of hospitality are themes that run through the works Agora vai ser assim (2018) and Teoria da fronteira (2017), published in Brazil and Portugal respectively. Both are touched by the 2015 migrant crisis in Europe which, as Michel Agier states, is much more than a crisis of European states in the face of immigrants: it is a crisis in the representation of the other. In this vein, Tonus and Mendonça propose a way out of the disquiet of a time in which xenophobia alarmingly holds a status of rationality in contemporary practices and policies, and in which exile has brought no reparation at all but has prolonged the trauma into the political suffering of a condition imposed as uncertain and precarious. Drawing on the studies of Alexis Nouss, Michel Agier, and others, we intend to explore some poetic images in order to reflect on the experience of refuge in contemporary times.

Literature (General), Manners and customs (General)


arXiv Open Access 2022

Audio Matters Too: How Audial Avatar Customization Enhances Visual Avatar Customization

Dominic Kao, Rabindra Ratan, Christos Mousas et al.

Avatar customization is known to positively affect crucial outcomes in numerous domains. However, it is unknown whether audial customization can confer the same benefits as visual customization. We conducted a preregistered 2 x 2 (visual choice vs. visual assignment x audial choice vs. audial assignment) study in a Java programming game. Participants with visual choice experienced higher avatar identification and autonomy. Participants with audial choice experienced higher avatar identification and autonomy, but only within the group of participants who had visual choice available. Visual choice led to an increase in time spent, and indirectly led to increases in intrinsic motivation, immersion, time spent, future play motivation, and likelihood of game recommendation. Audial choice moderated the majority of these effects. Our results suggest that audial customization plays an important enhancing role vis-à-vis visual customization. However, audial customization appears to have a weaker effect compared to visual customization. We discuss the implications for avatar customization more generally across digital applications.

en cs.HC


DOAJ Open Access 2021

deus-dará: O Rio de Janeiro como espaço de (des)encontro durante séculos

Helena Gonçalo Ferreira

deus-dará is a novel by Alexandra Lucas Coelho, narrated over seven days in different years and revealing a clear inspiration in Genesis, which presents seven characters, as its subtitle highlights: "sete dias na vida de São Sebastião do Rio de Janeiro ou o Apocalipse segundo Lucas, Judite, Zaca, Tristão, Inês, Gabriel & Noé" (seven days in the life of São Sebastião do Rio de Janeiro, or the Apocalypse according to Lucas, Judite, Zaca, Tristão, Inês, Gabriel & Noé). Through these characters, and with Rio de Janeiro ever present, the hybrid narrator, who writes in both European and Brazilian Portuguese, leads readers through five centuries of history and through the social, religious, and cultural circumstances of Portugal and Brazil during the colonial and postcolonial periods. Starting from academic work in postcolonial studies, this article uses the novel to explore the constructions and reconstructions of meaning in the various points of view on the history of Portuguese (de)colonization and its consequences, since, beyond the Portuguese colonial legacy visible in the violence that Brazilians experience, particularly in Rio de Janeiro, this book brings the Portuguese to the question of their relationship with their past, in the present moment.

Literature (General), Manners and customs (General)


DOAJ Open Access 2021

A arma da teoria: pensamento africano e literatura

Roberto Vecchi

The article seeks to rethink some characteristics of African philosophy in the Portuguese-speaking space in comparison with other African contexts that underwent colonization (in particular French and British Africa). After independence processes grounded in ideologically strong thought (notably the case of Amílcar Cabral), postcolonial times do not seem to be marked by radical reflections as occurred in other settings, above all through a favorable grafting of different radical lines of thought. Discussing the difference between philosophy and thought within the framework of the revision of thinking on community, which deconstructs the work of identity and the narratives of nation, the article proposes to find in literature, and not in philosophy in the strict sense, in the contexts of Portuguese-speaking Africa, shards of this radical thought that are scattered across numerous texts. Starting from the case of the reconfiguration of community in Luandino Vieira's classic Luuanda, the perspective outlined is that of a mapping of African thought in the Portuguese-speaking space as disseminated in literary texts.

Literature (General), Manners and customs (General)


DOAJ Open Access 2021

A doutrina de combate da expansão imperial na cronística portuguesa da Terra de Santa Cruz

Wellington José Gomes Freire

This article examines the representation of the ways of conducting war in sixteenth-century Portuguese imperial expansion found in the chronicle narratives that deal with the Lusitanian military presence in Portuguese America. It argues that the combat methods described in the texts suggest that the revolution in military affairs, the term used in the specialized literature for the modernization of early modern European armies, did not take root on Portuguese soil in the fifteenth and sixteenth centuries. The warriors and conquerors who took possession of a vast portion of the globe predominantly used tactics of disorderly infantry incursion in the style of raids. The study is based on a corpus of fifteenth- and sixteenth-century chroniclers: Gomes Eanes de Zurara, Rui de Pina, João de Barros, Lopes de Castanheda, Gaspar Correia, Gabriel Soares de Souza, and Frei Vicente de Salvador.

Literature (General), Manners and customs (General)


arXiv Open Access 2021

Customer Sentiment Analysis using Weak Supervision for Customer-Agent Chat

Navdeep Jain

Prior work on sentiment analysis using weak supervision primarily focuses on different reviews such as movies (IMDB), restaurants (Yelp), and products (Amazon). One under-explored field in this regard is customer chat data for a customer-agent chat in customer support, due to the lack of availability of free public data. Here, we perform sentiment analysis on customer chat using weak supervision on our in-house dataset. We fine-tune the pre-trained language model (LM) RoBERTa as a sentiment classifier using weak supervision. Our contribution is as follows: 1) We show that by using weak sentiment classifiers along with domain-specific lexicon-based rules as Labeling Functions (LF), we can train a fairly accurate customer chat sentiment classifier using weak supervision. 2) We compare the performance of our custom-trained model with the off-the-shelf Google Cloud NLP API for sentiment analysis. We show that by injecting domain-specific knowledge using LFs, even with weak supervision, we can train a model to handle some domain-specific use cases better than the off-the-shelf Google Cloud NLP API. 3) We also present an analysis of how customer sentiment in a chat relates to problem resolution.

en cs.CL, cs.AI


arXiv Open Access 2020

Disaggregating Customer-level Behind-the-Meter PV Generation Using Smart Meter Data and Solar Exemplars

Fankun Bu, Kaveh Dehghanpour, Yuxuan Yuan et al.

Customer-level rooftop photovoltaic (PV) has been widely integrated into distribution systems. In most cases, PVs are installed behind-the-meter (BTM), and only the net demand is recorded. Therefore, the native demand and PV generation are unknown to utilities. Separating native demand and solar generation from net demand is critical for improving grid-edge observability. In this paper, a novel approach is proposed for disaggregating customer-level BTM PV generation using low-resolution but widely available hourly smart meter data. The proposed approach exploits the strong correlation between monthly nocturnal and diurnal native demands and the high similarity among PV generation profiles. First, a joint probability density function (PDF) of monthly nocturnal and diurnal native demands is constructed for customers without PVs, using Gaussian mixture modeling (GMM). Deviation from the constructed PDF is utilized to probabilistically assess the monthly solar generation of customers with PVs. Then, to identify hourly BTM solar generation for these customers, their estimated monthly solar generation is decomposed into an hourly timescale; to do this, we have proposed a maximum likelihood estimation (MLE)-based technique that utilizes hourly typical solar exemplars. Leveraging the strong monthly native demand correlation and high PV generation similarity enhances our approach's robustness against the volatility of customers' hourly load and enables highly accurate disaggregation. The proposed approach has been verified using real native demand and PV generation data.

en eess.SP


arXiv Open Access 2020

Customized Graph Neural Networks

Yiqi Wang, Yao Ma, Wei Jin et al.

Recently, Graph Neural Networks (GNNs) have greatly advanced the task of graph classification. Typically, we first build a unified GNN model with graphs in a given training set and then use this unified model to predict labels of all the unseen graphs in the test set. However, graphs in the same dataset often have dramatically distinct structures, which indicates that a unified model may be sub-optimal given an individual graph. Therefore, in this paper, we aim to develop customized graph neural networks for graph classification. Specifically, we propose a novel customized graph neural network framework, i.e., Customized-GNN. Given a graph sample, Customized-GNN can generate a sample-specific model for this graph based on its structure. Meanwhile, the proposed framework is general enough to be applied to numerous existing graph neural network models. Comprehensive experiments on various graph classification benchmarks demonstrate the effectiveness of the proposed framework.

en cs.LG, stat.ML

