Bentsen Kristiansen posted an update 1 week, 3 days ago
Contact prediction within a protein has recently become a viable method for accurate prediction of protein structure. Using predicted distance distributions has been shown in many cases to be superior to using only a binary contact annotation. Predicted inter-protein distances have also been shown to be sufficient to dock some protein dimers.
Here, we present pyconsFold. Using CNS as its underlying folding mechanism together with predicted contact distances, it outperforms regular contact-prediction-based modelling on our dataset of 210 proteins. It performs marginally worse than the state-of-the-art pyRosetta folding pipeline but is on average about 20 times faster per model. More importantly, pyconsFold can also be used as a fold-and-dock protocol by using predicted inter-protein contacts/distances to simultaneously fold and dock two protein chains.
pyconsFold is implemented in Python 3 with a strong focus on using as few dependencies as possible for longevity. It is available both as a pip package in Python 3 and as source code on GitHub and is published under the GPLv3 license.
Install instructions, examples and parameters can be found in the supplemental notes.
The data underlying this article together with source code are available on GitHub, at https://github.com/johnlamb/pyconsfold.
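To make the difference between a binary contact annotation and a predicted distance distribution concrete, here is a minimal sketch for a single residue pair. It is not pyconsFold code: the bin edges, the probabilities, and the 8 Å contact cutoff are illustrative assumptions.

```python
# Toy example for one residue pair. All numbers are made up for illustration.
BIN_EDGES = [4.0, 6.0, 8.0, 10.0, 12.0, 14.0, 16.0]  # Angstrom, hypothetical binning
CONTACT_CUTOFF = 8.0  # assumed contact definition


def contact_probability(probs, edges=BIN_EDGES, cutoff=CONTACT_CUTOFF):
    """Sum the probability mass of all bins whose upper edge is within the cutoff."""
    return sum(p for p, upper in zip(probs, edges[1:]) if upper <= cutoff)


def expected_distance(probs, edges=BIN_EDGES):
    """Probability-weighted mean of the bin centres."""
    centres = [(lo + hi) / 2 for lo, hi in zip(edges[:-1], edges[1:])]
    return sum(p * c for p, c in zip(probs, centres))


# Predicted distribution over the six bins (sums to 1).
probs = [0.05, 0.15, 0.40, 0.25, 0.10, 0.05]

p_contact = contact_probability(probs)
is_contact = p_contact >= 0.5          # binary annotation keeps only this flag
mean_dist = expected_distance(probs)   # distance restraint keeps this value

print(f"P(contact) = {p_contact:.2f}, binary contact = {is_contact}, "
      f"expected distance = {mean_dist:.2f} A")
```

In this toy case the binary annotation records only "no contact", whereas the distribution still says the pair is most likely a little under 10 Å apart, which is the kind of information a distance restraint can pass on to the folding step.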
Ogburn et al. (Am J Epidemiol. 2021;000(0):000-000) raise a cautionary tale for epidemiological data fusion: bias may occur if a variable that is completely missing in the primary dataset is imputed according to a regression model estimated from an auxiliary dataset. However, in some specific settings, a solution may exist. Focusing on a linear outcome regression model with a missing covariate, we show that the bias can be eliminated if the underlying imputation model for the missing covariate is nonlinear in the common variables measured in both datasets. Otherwise, we describe two alternative approaches from the data fusion literature that could partially resolve this issue: one estimates the outcome model by leveraging an additional validation dataset containing joint observations of the outcome and the missing covariate, and the other offers informative bounds for the outcome regression coefficients without using validation data. We justify these three methods on a linear outcome model and briefly discuss their extension to general settings.
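The role of nonlinearity can be illustrated with a short derivation. The notation is an assumption introduced here, not taken from the paper: Y is the outcome, X a common variable observed in both datasets, Z the covariate missing from the primary dataset, and m(X) = E[Z | X] the imputation model, assumed to be the same in both datasets.

```latex
\begin{align*}
Y &= \beta_0 + \beta_1 X + \beta_2 Z + \varepsilon,
  \qquad E[\varepsilon \mid X, Z] = 0, \\
E[Y \mid X] &= \beta_0 + \beta_1 X + \beta_2\, m(X),
  \qquad m(X) = E[Z \mid X]. \\[6pt]
\text{Linear case, } m(X) = a + bX:\quad
E[Y \mid X] &= (\beta_0 + \beta_2 a) + (\beta_1 + \beta_2 b)\, X. \\[6pt]
\text{Nonlinear case, } m(X) = a + bX + cX^2,\ c \neq 0:\quad
E[Y \mid X] &= (\beta_0 + \beta_2 a) + (\beta_1 + \beta_2 b)\, X + \beta_2 c\, X^2.
\end{align*}
```

In the linear case the primary data identify only the combination \beta_1 + \beta_2 b, so \beta_1 and \beta_2 cannot be separated. In the nonlinear case the coefficient on X^2 equals \beta_2 c; since a, b, and c are estimable from the auxiliary dataset, \beta_2 and then \beta_1 and \beta_0 can be recovered.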
Efficient sampling of conformational space is essential for elucidating functional/allosteric mechanisms of proteins and generating ensembles of conformers for docking applications. However, unbiased sampling is still a challenge, especially for highly flexible and/or large systems. To address this challenge, we describe a new implementation of our computationally efficient algorithm ClustENMD that is integrated with the ProDy and OpenMM software packages. This hybrid method performs iterative cycles of conformer generation using an elastic network model (ENM) for deformations along global modes, followed by clustering and short molecular dynamics (MD) simulations. The ProDy framework enables full automation and analysis of generated conformers and visualization of their distributions in the essential subspace.
ClustENMD is open-source and freely available under the MIT License from https://github.com/prody/ProDy.
Supplementary materials comprising method details, figures, table and tutorial are available at Bioinformatics online.
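As a rough illustration of what "elastic network" means in the ENM step, the sketch below builds the connectivity (Kirchhoff) matrix of a toy network: nodes closer than a cutoff are joined by a spring. This is not the ClustENMD or ProDy implementation; the coordinates, the 7 Å cutoff, and the function name are assumptions chosen for the example, and the later steps (normal modes, clustering, MD) are not shown.

```python
import math


def kirchhoff(coords, cutoff=7.0):
    """Connectivity matrix: -1 for node pairs within the cutoff, degree on the diagonal."""
    n = len(coords)
    gamma = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                gamma[i][j] = gamma[j][i] = -1
                gamma[i][i] += 1
                gamma[j][j] += 1
    return gamma


# Four hypothetical nodes on a straight line, 3.8 A apart.
coords = [(3.8 * i, 0.0, 0.0) for i in range(4)]
gamma = kirchhoff(coords)

for row in gamma:
    print(row)
```

With this spacing only neighbouring nodes are connected, so the matrix is that of a simple chain and every row sums to zero. The low-frequency eigenvectors of such a matrix are what ENM-based methods refer to as global modes.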
The identification and discovery of phenotypes from high content screening (HCS) images is a challenging task. Earlier works extract biological features with image analysis pipelines, rely on supervised training, or generate features with neural networks pretrained on non-cellular images. We introduce a novel unsupervised deep learning algorithm that clusters cellular images with similar Mode-of-Action (MOA) together using only the images’ pixel intensity values as input. It corrects for batch effects during training. Importantly, our method does not require the extraction of cell candidates and works directly on entire images.
The method achieves competitive results on the labelled subset of the BBBC021 dataset with an accuracy of 97.09% for correctly classifying the MOA by nearest neighbors matching. Importantly, we can train our approach on unannotated datasets. Therefore, our method can discover novel MOAs and annotate unlabelled compounds. The ability to train end-to-end on the full resolution images makes our method easy to apply and allows it to further distinguish treatments by their effect on proliferation.
Our code is available at https://github.com/Novartis/UMM-Discovery.
Supplementary data are available at Bioinformatics online.
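The evaluation by nearest-neighbor matching can be sketched as follows: each treatment embedding is assigned the MOA label of its most similar other embedding, and accuracy is the fraction of correct assignments. This is a simplified leave-one-out version with cosine similarity, not the authors' evaluation code; the embeddings and labels are made up, and any exclusion rules used on BBBC021 (for example, skipping neighbors from the same compound) are not modelled.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nn_accuracy(embeddings, labels):
    """Leave-one-out 1-nearest-neighbor accuracy."""
    correct = 0
    for i, query in enumerate(embeddings):
        others = [j for j in range(len(embeddings)) if j != i]
        best = max(others, key=lambda j: cosine(query, embeddings[j]))
        correct += labels[best] == labels[i]
    return correct / len(embeddings)


# Hypothetical 2-D embeddings for five treatments and their MOA labels.
embeddings = [(1.0, 0.1), (0.9, 0.2), (0.1, 1.0), (0.2, 0.9), (0.8, 0.5)]
labels = ["A", "A", "B", "B", "B"]

accuracy = nn_accuracy(embeddings, labels)
print(f"1-NN accuracy: {accuracy:.2f}")
```

The last treatment is placed closer to the "A" group on purpose, so it is the one misclassified point in this toy set.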
The future of single cell diversity screens involves ever-larger sample sizes, dictating the need for higher throughput methods with low analytical noise to accurately describe the nature of the cellular system. Current approaches are limited by Poisson statistics, requiring dilute cell suspensions and associated losses in throughput. In this contribution, we apply Dean entrainment to both cell and bead inputs, defining different volume packets to effect efficient co-encapsulation. Volume ratio scaling was explored to identify optimal conditions. This enabled the co-encapsulation of single cells with reporter beads at rates of ∼1 million cells per hour, while increasing assay signal-to-noise with cell multiplet rates of ∼2.5% and capturing ∼70% of cells. The method, called Pirouette coupling, extends our capacity to investigate biological systems.
The organometallic H-cluster of the [FeFe]-hydrogenase consists of a [4Fe-4S] cubane bridged via a cysteinyl thiolate to a 2Fe subcluster ([2Fe]H) containing CO, CN⁻, and dithiomethylamine (DTMA) ligands. The H-cluster is synthesized by three dedicated maturation proteins: the radical SAM enzymes HydE and HydG synthesize the non-protein ligands, while the GTPase HydF serves as a scaffold for assembly of [2Fe]H prior to its delivery to the [FeFe]-hydrogenase containing the [4Fe-4S] cubane. HydG uses L-tyrosine as a substrate, cleaving it to produce p-cresol as well as the CO and CN⁻ ligands to the H-cluster, although there is some question as to whether these are formed as free diatomics or as part of a [Fe(CO)2(CN)] synthon. Here we show that Clostridium acetobutylicum (C.a.) HydG catalyzes formation of multiple equivalents of free CO at rates comparable to those for CN⁻ formation. Free CN⁻ is also formed in excess molar equivalents over protein. A g = 8.9 EPR signal is observed for C.a. HydG reconstituted to load the 5th “dangler” iron of the auxiliary [4Fe-4S][FeCys] cluster and is assigned to this “dangler-loaded” cluster state.
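Returning to the single-cell co-encapsulation abstract above, the Poisson limit it mentions can be made concrete with a few lines of arithmetic. The sketch assumes random, independent loading of cells and beads into droplets, with the mean number per droplet (lam) as the only parameter; the loading values are illustrative and not taken from the paper.

```python
import math


def poisson_pmf(k, lam):
    """Probability of exactly k objects in a droplet at mean occupancy lam."""
    return lam ** k * math.exp(-lam) / math.factorial(k)


def loading_stats(lam):
    """Empty, single, and multiplet fractions for random (Poisson) loading."""
    p0 = poisson_pmf(0, lam)
    p1 = poisson_pmf(1, lam)
    occupied = 1.0 - p0
    return {
        "empty": p0,
        "single": p1,
        "multiplet_among_occupied": (occupied - p1) / occupied,
    }


for lam in (0.05, 0.1, 1.0):  # hypothetical mean cells per droplet
    s = loading_stats(lam)
    print(f"lam={lam}: empty={s['empty']:.3f}, single={s['single']:.3f}, "
          f"multiplets among occupied={s['multiplet_among_occupied']:.3f}")

# Independent random loading of cells and beads, each at lam = 0.1:
# fraction of droplets holding exactly one cell and exactly one bead.
joint_single = poisson_pmf(1, 0.1) * poisson_pmf(1, 0.1)
print(f"one cell + one bead: {joint_single:.4f}")
```

Keeping multiplets low under random loading forces a dilute suspension, and at a mean occupancy of 0.1 for both inputs fewer than 1% of droplets contain exactly one cell and one bead. This is the throughput penalty that ordered loading schemes aim to avoid.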