Kazuki Nakamae, Takayuki Suzuki, Sora Yonezawa
et al.
Base-editing technologies, particularly cytosine base editors (CBEs), allow precise gene modification without introducing double-strand breaks; however, unintended RNA off-target effects remain a critical concern and are understudied. To address this gap, we developed the Pipeline for CRISPR-induced Transcriptome-wide Unintended RNA Editing (PiCTURE), a standardized computational pipeline for detecting and quantifying transcriptome-wide CBE-induced RNA off-target events. PiCTURE identifies both canonical ACW (W = A or T/U) motif-dependent and non-canonical RNA off-targets, revealing a broader WCW motif that underlies many unanticipated edits. Additionally, we developed two machine learning models based on the DNABERT-2 language model, termed STL and SNL, which outperformed motif-only approaches in terms of accuracy, precision, recall, and F1 score. To demonstrate the practical application of our predictive model for CBE-induced RNA off-target risk, we integrated PiCTURE outputs with the Predicting RNA Off-target compared with Tissue-specific Expression for Caring for Tissue and Organ (PROTECTiO) pipeline and estimated RNA off-target risk for each transcript showing tissue-specific expression. The analysis revealed differences among tissues: while the brain and ovaries exhibited a relatively low off-target burden, the colon and lungs displayed relatively high risks. Our study provides a comprehensive framework for RNA off-target profiling, emphasizing the importance of advanced machine learning-based classifiers in CBE safety evaluations and offering valuable insights to inform the development of safer genome-editing therapies.
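The ACW and WCW motif classes named in this abstract can be illustrated with a small sketch. The helper `classify_cytosine_context` and the example sequences are hypothetical and are not part of PiCTURE; the code only checks the trinucleotide context around a cytosine, treating W as A or T/U.

```python
def classify_cytosine_context(seq: str, pos: int) -> str:
    """Classify the trinucleotide context of the cytosine at index `pos`.

    Returns "ACW" for the canonical motif, "WCW" for the broader motif
    that is not also ACW, and "other" otherwise.
    Hypothetical helper for illustration only; not part of PiCTURE.
    """
    seq = seq.upper().replace("U", "T")  # treat RNA U as T
    # The cytosine needs a neighbour on both sides.
    if pos <= 0 or pos >= len(seq) - 1 or seq[pos] != "C":
        return "other"
    weak = {"A", "T"}  # IUPAC code W
    left, right = seq[pos - 1], seq[pos + 1]
    if right not in weak:
        return "other"
    if left == "A":
        return "ACW"
    if left in weak:
        return "WCW"
    return "other"


print(classify_cytosine_context("GACAU", 2))  # ACW
print(classify_cytosine_context("GUCUG", 2))  # WCW
print(classify_cytosine_context("GGCAU", 2))  # other
```

A motif-only classifier of this kind is the baseline that the abstract reports the STL and SNL models as outperforming.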
We present a parallel variant of Pruned Landmark Labelling (PLL) that is optimised for the preprocessing of hub labels on directed acyclic graphs (DAGs). This method was developed during a seminar at the Karlsruhe Institute of Technology (KIT), focusing on time-expanded graphs that model public transport networks. The approach leverages the topological properties of DAGs to enable a novel parallel construction of hub labels.
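As background for the abstract above, the following is a minimal sketch of how hub labels are typically queried on a directed graph once preprocessing has produced them. It does not reproduce the parallel construction itself. The dictionary-based label layout and the name `hub_query` are assumptions made for illustration.

```python
import math


def hub_query(out_label: dict, in_label: dict) -> float:
    """Return the s-t distance from the forward label of s and backward label of t.

    out_label maps hub -> distance from s to that hub.
    in_label maps hub -> distance from that hub to t.
    Returns math.inf when the labels share no hub (t unreachable from s).
    """
    best = math.inf
    for hub, dist_s_hub in out_label.items():
        dist_hub_t = in_label.get(hub)
        if dist_hub_t is not None:
            best = min(best, dist_s_hub + dist_hub_t)
    return best


# Hand-built labels for the DAG a -> b (1), b -> c (2), a -> c (5).
out_a = {"a": 0, "b": 1}
in_c = {"c": 0, "b": 2}
print(hub_query(out_a, in_c))            # 3
print(hub_query({"c": 0}, {"a": 0}))     # inf
```

Production implementations usually store labels as arrays sorted by hub identifier and merge them in a single linear pass; the dictionary lookup here trades that efficiency for brevity.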
We present a simpler and faster algorithm for low-diameter decompositions on directed graphs, matching the $O(\log n\log\log n)$ loss factor from Bringmann, Fischer, Haeupler, and Latypov (ICALP 2025) and improving the running time to $O((m+n\log\log n)\log^2n)$.
Jaberi [7] presented approximation algorithms for the problem of computing a minimum-size 2-vertex strongly biconnected subgraph in directed graphs. We have implemented the approximation algorithms presented in [7] and tested the implementation on some graphs. The experimental results show that these algorithms work well in practice.
We show a $(1+ε)$-approximation algorithm for maintaining maximum $s$-$t$ flow under $m$ edge insertions in $m^{1/2+o(1)} ε^{-1/2}$ amortized update time for directed, unweighted graphs. This constitutes the first sublinear dynamic maximum flow algorithm in general sparse graphs with arbitrarily good approximation guarantee.
We show that for each non-negative integer k, every bipartite tournament either contains k arc-disjoint cycles or has a feedback arc set of size at most 7(k - 1).
In the 4-path vertex cover problem, the input is an undirected graph $G$ and an integer $k$. The goal is to decide whether there is a set of vertices $S$ of size at most $k$ such that every path with 4 vertices in $G$ contains at least one vertex of $S$. In this paper we give a parameterized algorithm for 4-path vertex cover whose time complexity is $O^*(2.619^k)$.
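The problem definition above can be made concrete with a small verifier plus an exhaustive search. This is a brute-force sketch for tiny graphs, not the $O^*(2.619^k)$ algorithm of the paper. The function names and the adjacency-set representation are assumptions chosen for illustration.

```python
from itertools import combinations


def has_4_path(adj: dict, removed=frozenset()) -> bool:
    """Return True if the graph minus `removed` has a path a-b-c-d on 4 distinct vertices."""
    for b in adj:
        if b in removed:
            continue
        for c in adj[b]:  # (b, c) is the middle edge of the path
            if c in removed:
                continue
            for a in adj[b]:
                if a in removed or a == c:
                    continue
                for d in adj[c]:
                    if d in removed or d == b or d == a:
                        continue
                    return True
    return False


def is_4_path_vertex_cover(adj: dict, candidate) -> bool:
    """Check that every 4-vertex path contains a vertex of `candidate`."""
    return not has_4_path(adj, set(candidate))


def has_cover_of_size_at_most(adj: dict, k: int) -> bool:
    """Exhaustive decision procedure; exponential in the number of vertices."""
    vertices = list(adj)
    for size in range(min(k, len(vertices)) + 1):
        for candidate in combinations(vertices, size):
            if is_4_path_vertex_cover(adj, candidate):
                return True
    return False


# Path graph 0 - 1 - 2 - 3, stored as symmetric adjacency sets.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
print(has_cover_of_size_at_most(path, 0))  # False
print(has_cover_of_size_at_most(path, 1))  # True
```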
Recent work has suggested enhancing Bloom filters by using a pre-filter, based on applying machine learning to model the data set the Bloom filter is meant to represent. Here we model such learned Bloom filters, clarifying what guarantees can and cannot be associated with such a structure.
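The structure discussed in this abstract can be sketched as follows. The sketch assumes a commonly described construction in which keys that the learned pre-filter scores below a threshold are stored in a backup Bloom filter, so that no key is ever reported absent. The class names, the `score` callable, and the sizing parameters are hypothetical; the "model" here is any user-supplied scoring function rather than a trained one.

```python
import hashlib
from typing import Callable, Iterable


class BloomFilter:
    """Plain Bloom filter over strings using salted SHA-256 hashes."""

    def __init__(self, num_bits: int, num_hashes: int):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: str):
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: str) -> None:
        for p in self._positions(item):
            self.bits[p // 8] |= 1 << (p % 8)

    def __contains__(self, item: str) -> bool:
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(item))


class LearnedBloomFilter:
    """Pre-filter (score >= threshold) backed by a Bloom filter for missed keys."""

    def __init__(self, keys: Iterable[str], score: Callable[[str], float],
                 threshold: float, num_bits: int = 1024, num_hashes: int = 3):
        self.score = score
        self.threshold = threshold
        self.backup = BloomFilter(num_bits, num_hashes)
        for key in keys:
            if score(key) < threshold:  # the model would miss this key
                self.backup.add(key)

    def __contains__(self, item: str) -> bool:
        return self.score(item) >= self.threshold or item in self.backup


# Toy scoring function standing in for a trained model (assumption).
def toy_score(s: str) -> float:
    return 1.0 if s.endswith(".example") or "bad" in s else 0.0


keys = ["evil.example", "bad.test", "ok-looking.org"]
lbf = LearnedBloomFilter(keys, toy_score, threshold=0.5)
print(all(k in lbf for k in keys))  # True: no false negatives on stored keys
```

In this construction the false positive behaviour depends on how queries are distributed relative to the model, which is the kind of guarantee the abstract sets out to clarify.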
A new method for solving systems of linear equations over commutative integral domains is proposed. Its complexity is the same as that of matrix multiplication.
Given a graph on $n$ vertices and an integer $k$, the feedback vertex set problem asks for the deletion of at most $k$ vertices to make the graph acyclic. We show that a greedy branching algorithm, which always branches on an undecided vertex with the largest degree, runs in single-exponential time, i.e., $O(c^k\cdot n^2)$ for some constant $c$.
Given an $n$-point metric space, consider the problem of finding a point with the minimum sum of distances to all points. We show that this problem has a randomized algorithm that {\em always} outputs a $(2+ε)$-approximate solution in expected $O(n/ε^2)$ time for each constant $ε>0$. Inheriting Indyk's algorithm, our algorithm outputs a $(1+ε)$-approximate $1$-median in $O(n/ε^2)$ time with probability $Ω(1)$.
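To make the objective concrete, here is an exact baseline for the 1-median problem that evaluates all pairwise distances. It is not the randomized algorithm of the abstract: it uses a quadratic number of distance evaluations, which is the cost the sublinear-time approximation avoids. The function name `exact_one_median` and the example metric are assumptions for illustration.

```python
import math
from typing import Callable, Sequence, Tuple, TypeVar

P = TypeVar("P")


def exact_one_median(points: Sequence[P],
                     dist: Callable[[P, P], float]) -> Tuple[P, float]:
    """Return (point, cost) minimising the sum of distances to all points.

    Uses len(points) ** 2 distance evaluations; ties go to the earliest point.
    """
    if not points:
        raise ValueError("points must be non-empty")
    best_point, best_cost = points[0], math.inf
    for p in points:
        cost = sum(dist(p, q) for q in points)
        if cost < best_cost:
            best_point, best_cost = p, cost
    return best_point, best_cost


# Points on the real line with the absolute-difference metric.
median, cost = exact_one_median([0, 1, 2, 3, 10], lambda x, y: abs(x - y))
print(median, cost)  # 2 12
```

A point is a $(2+ε)$-approximate solution if its cost is at most $(2+ε)$ times the cost returned by this baseline.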
The model set of a general Boolean function in CNF is calculated in a compressed format, using novel wildcards. This method can be explained in very visual ways. Preliminary comparison with existing methods (BDDs and Mathematica's ESOP command) looks promising, but our algorithm begs for a C encoding, which would render it comparable in more systematic ways.
In this note we point out various errors in the paper by Rashmi Gupta and R. R. Saxena, Set packing problem with linear fractional objective function, International Journal of Mathematics and Computer Applications Research (IJMCAR), 4 (2014) 9 - 18. We also provide some additional results.
In this paper, we first give a sequential linear-time algorithm for the longest path problem in meshes. This algorithm can be considered an improvement on [13]. Then, based on this sequential algorithm, we present a constant-time parallel algorithm for the problem which can be run on every parallel machine.
This document specifies the Relational Schema Protocol (RSP). RSP enables loosely coupled applications to share and exchange relational data. It defines a fixed message format for an arbitrary relational schema, so that changes in the data schema do not affect the message format. This spares the interacting applications from having to be reimplemented whenever the data schema evolves.
We give a constant factor approximation algorithm for the asymmetric traveling salesman problem when the support graph of the solution of the Held-Karp linear programming relaxation has bounded orientable genus.
We prove that longest common prefix (LCP) information can be stored in much less space than previously known. More precisely, we show that in the presence of the text and the suffix array, o(n) additional bits are sufficient to answer LCP-queries asymptotically in the same time that is needed to retrieve an entry from the suffix array. This yields the smallest compressed suffix tree with sub-logarithmic navigation time.
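The setting of this abstract, in which LCP values are derived from the text and the suffix array, can be illustrated with a naive sketch. It stores no extra bits at all and pays for that with query time proportional to the LCP value; it is not the $o(n)$-bit data structure of the paper. The function names are assumptions, and the suffix array is built by plain sorting for brevity.

```python
from typing import List


def suffix_array(text: str) -> List[int]:
    """Naive suffix array: starting positions of suffixes in sorted order."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def lcp_query(text: str, sa: List[int], i: int) -> int:
    """Length of the longest common prefix of the suffixes at ranks i-1 and i.

    Computed by direct character comparison on the text; returns 0 for i == 0.
    """
    if i == 0:
        return 0
    a, b = sa[i - 1], sa[i]
    length = 0
    while a + length < len(text) and b + length < len(text) \
            and text[a + length] == text[b + length]:
        length += 1
    return length


text = "banana"
sa = suffix_array(text)
print(sa)                                               # [5, 3, 1, 0, 4, 2]
print([lcp_query(text, sa, i) for i in range(len(sa))])  # [0, 1, 3, 0, 0, 2]
```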
The purpose of this note is to attach a name to a natural class of combinatorial problems and to point out that this class includes many important special cases. We also show that a simple problem of placing nonoverlapping labels on a rectangular map is NP-complete.