
issue comment theislab/scanpy

mnn_correct() ValueError: not enough values to unpack (expected 3, got 1)

Why is it not maintained? If it doesn’t work, that’s a bug.

Gpasquini

comment created time in 16 hours

issue comment theislab/scanpy

Remove batch effect ("Integrate" in Seurat)

You need to tell us which data and which code you used, so we can see whether it’s a bug or a mistake on your end.

grimwoo

comment created time in 16 hours

issue comment theislab/scvelo

Back to flit?

You forgot to upload a wheel for the newest release:

python setup.py bdist_wheel
twine upload dist/scvelo-....whl
flying-sheep

comment created time in a day


delete branch flying-sheep/noitadocs

delete branch : crypto-pathlib

delete time in 3 days

push event flying-sheep/noitadocs

Philipp A

commit sha 33767db8f3cf285047ff923d5c1780867c8a7fbc

Use correct module name & pathlib

view details

push time in 5 days

PR opened noita-player/noitadocs

Some fixes

This

  • Adds the default path for Linux to the autodetection
  • Uses pathlib instead of shitty os.path
  • Fixes a print statement
  • Orders the imports: builtin → 3rd party → 1st party
+60 -55

0 comment

5 changed files

pr created time in 5 days

push event flying-sheep/noitadocs

Philipp A

commit sha b579d0e8537cf80ffa45dd9535a1e12e4503840e

Use correct module name & pathlib

view details

push time in 5 days

create branch flying-sheep/noitadocs

branch : crypto-pathlib

created branch time in 5 days

fork flying-sheep/noitadocs

documentation on engine internals for noita modders

fork in 5 days


push event flying-sheep/get_version

Philipp A

commit sha d76ef5b9f022c2073b963d30480d4585b2ede842

Simplify travis

view details

push time in 6 days

push event flying-sheep/get_version

Philipp A

commit sha c3bc5af9ca2636f1c44de42fc983f32b68e543b3

Don’t test 3.5 anymore

view details

push time in 6 days

Pull request review comment theislab/scanpy

dtype fixes for downsample and normalization

 def _normalize_data(X, counts, after=None, copy=False):
     X = X.copy() if copy else X
+    if issubclass(X.dtype.type, (int, np.integer)):
+        X = X.astype(np.float32)  # TODO: Check if float64 should be used
     after = np.median(counts[counts>0]) if after is None else after
     counts += (counts == 0)
-    counts /= after
+    counts = counts / after
     if issparse(X):
         sparsefuncs.inplace_row_scale(X, 1/counts)
     else:
-        X /= counts[:, None]
-    return X if copy else None
+        np.divide(X, counts[:, None], out=X)

Why are this and the above change necessary? I thought that if __itruediv__ is not available, Python falls back to __truediv__ anyway?
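For context, a quick demonstration: ndarray does define __itruediv__, so no fallback to __truediv__ happens, and the in-place ufunc refuses to cast the float result into an integer array:

import numpy as np

X = np.arange(6).reshape(2, 3)  # integer dtype
try:
    X /= 2.0  # in-place true division on an int array
except TypeError as e:
    print(e)  # numpy can’t cast the float64 result back into the int array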

ivirshup

comment created time in 6 days

push event flying-sheep/knn.covertree

Philipp A

commit sha 9c3e4cae5b43cb1be2f5dd853e58b5af5558dbfa

literally only added one .

view details

push time in 6 days

issue closed theislab/anndata2ri

Fail to install on Windows

Hello, I'm trying to install anndata2ri, but I keep running into errors. When I try to install through bioconda, I get an environment conflict (details below); any idea as to why that would be?

<details><summary>Error</summary>

(sc-analysis-win) C:\Users\felix\Downloads>conda install anndata2ri
Collecting package metadata (current_repodata.json): done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.
Solving environment: failed with repodata from current_repodata.json, will retry with next repodata source.
Collecting package metadata (repodata.json): done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.
Solving environment: failed
Found conflicts! Looking for incompatible packages.
This can take several minutes.  Press CTRL-C to abort.
[progress bars for examining packages (r-blob, r-r6, …) and finding shortest conflict paths omitted]

UnsatisfiableError: The following specifications were found to be incompatible with each other:



Package r-r6 conflicts for:
r-dbplyr -> r-r6[version='>=2.2.2']
r-callr -> r-r6
r-callr -> r-processx[version='>=3.1.0'] -> r-testthat -> r-r6[version='>=2.2.0']
r-purrr -> r-dplyr -> r-r6[version='>=2.2.2']
r-recommended -> mro-basics=3.5.1 -> r-r6[version='>=2.2.2,<2.2.3.0a0']
r-httr -> r-r6
r-progress -> r-r6
r-r6
r-promises -> r-r6
r-modelr -> r-dplyr -> r-r6[version='>=2.2.2']
r-shiny -> r-r6[version='>=2.0']
r-ggplot2 -> r-scales[version='>=0.4.1'] -> r-r6
r-tidyr -> r-dplyr[version='>=0.7.0'] -> r-r6[version='>=2.2.2']
r-rbokeh -> r-gistr -> r-dplyr -> r-r6[version='>=2.2.2']
r-reprex -> r-callr[version='>=2.0.0'] -> r-r6
r-recipes -> r-dplyr -> r-r6[version='>=2.2.2']
r-readr -> r-r6
r-scales -> r-r6
r-reprex -> r-callr[version='>=2.0.0'] -> r-processx[version='>=3.1.0'] -> r-testthat -> r-r6[version='>=2.2.0']
r-selectr -> r-r6
r-essentials -> r-recommended[version='>=3.6.0'] -> mro-basics=3.5.1 -> r-r6[version='>=2.2.2,<2.2.3.0a0']
r-caret -> r-recipes[version='>=0.0.1'] -> r-dplyr -> r-r6[version='>=2.2.2']
rpy2 -> r-dbplyr -> r-r6[version='>=2.2.2']
r-shiny -> r-httpuv[version='>=1.3.3'] -> r-r6
r-rbokeh -> r-r6
r-broom -> r-dplyr -> r-r6[version='>=2.2.2']
r-httpuv -> r-r6
r-tidyverse -> r-dbplyr[version='>=1.1.0'] -> r-r6[version='>=2.2.2']
r-essentials -> r-tidyverse[version='>=1.2.1'] -> r-reprex[version='>=0.1.1'] -> r-callr[version='>=2.0.0'] -> r-processx[version='>=3.1.0'] -> r-testthat -> r-r6[version='>=2.2.0']
r-processx -> r-r6
r-tidyverse -> r-reprex[version='>=0.1.1'] -> r-callr[version='>=2.0.0'] -> r-processx[version='>=3.1.0'] -> r-testthat -> r-r6[version='>=2.2.0']
r-processx -> r-testthat -> r-r6[version='>=2.2.0']
r-haven -> r-readr[version='>=0.1.0'] -> r-r6
r-pbdzmq -> r-r6
r-readxl -> r-progress -> r-r6
r-essentials -> r-dplyr[version='>=0.7.6'] -> r-r6[version='>=2.0|>=2.2.2']
r-rvest -> r-httr[version='>=0.5'] -> r-r6
r-dplyr -> r-r6[version='>=2.2.2']
r-irkernel -> r-pbdzmq[version='>=0.2_1'] -> r-r6
r-tidyselect -> r-purrr -> r-dplyr -> r-r6[version='>=2.2.2']
Note that strict channel priority may have removed packages required for satisfiability.

</details>

When I try to install through pip, I get an error message; it seems to be a problem with rpy2, but I don't know exactly what:

(sc-analysis-win) C:\Users\felix\Downloads>pip install anndata2ri
Collecting anndata2ri
  Using cached https://files.pythonhosted.org/packages/b7/09/34c48ce5b4e99d022dcde7e1643f4a9c1523fc91bb916b1c97891bdbdadd/anndata2ri-1.0-py3-none-any.whl
Requirement already satisfied: anndata in c:\programdata\anaconda3\envs\sc-analysis-win\lib\site-packages (from anndata2ri) (0.6.22.post1)
Collecting rpy2>=3.0.1 (from anndata2ri)
  Using cached https://files.pythonhosted.org/packages/7e/e0/7da849bb6cf47466ceb28a75f930e61c311878882c275dfb4bbb4fdcc3cb/rpy2-3.2.0.tar.gz
    ERROR: Command errored out with exit status 1:
     command: 'C:\ProgramData\Anaconda3\envs\sc-analysis-win\python.exe' -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\\Users\\felix\\AppData\\Local\\Temp\\pip-install-0z14nnlr\\rpy2\\setup.py'"'"'; __file__='"'"'C:\\Users\\felix\\AppData\\Local\\Temp\\pip-install-0z14nnlr\\rpy2\\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' egg_info --egg-base pip-egg-info
         cwd: C:\Users\felix\AppData\Local\Temp\pip-install-0z14nnlr\rpy2\
    Complete output (17 lines):
    Der Befehl "sh" ist entweder falsch geschrieben oder
    konnte nicht gefunden werden.
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "C:\Users\felix\AppData\Local\Temp\pip-install-0z14nnlr\rpy2\setup.py", line 102, in <module>
        c_extension_status = get_r_c_extension_status()
      File "C:\Users\felix\AppData\Local\Temp\pip-install-0z14nnlr\rpy2\setup.py", line 91, in get_r_c_extension_status
        *situation.get_r_flags(r_home, '--ldflags')
      File "C:\Users\felix\AppData\Local\Temp\pip-install-0z14nnlr\rpy2\rpy2\situation.py", line 176, in get_r_flags
        allow_empty=False)))
      File "C:\Users\felix\AppData\Local\Temp\pip-install-0z14nnlr\rpy2\rpy2\situation.py", line 145, in _get_r_cmd_config
        universal_newlines=True
      File "C:\ProgramData\Anaconda3\envs\sc-analysis-win\lib\subprocess.py", line 395, in check_output
        **kwargs).stdout
      File "C:\ProgramData\Anaconda3\envs\sc-analysis-win\lib\subprocess.py", line 487, in run
        output=stdout, stderr=stderr)
    subprocess.CalledProcessError: Command '('C:\\ProgramData\\Anaconda3\\envs\\sc-analysis-win\\lib\\R\\bin\\x64\\R', 'CMD', 'config', '--ldflags')' returned non-zero exit status 1.
    ----------------------------------------
ERROR: Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.

I'm sorry, but some of the error is in German (I guess due to my system language); it just means: the command 'sh' is either misspelled or does not exist.

I'm running Windows 10 64-bit with Anaconda.

Thank you very much already!

Regards, Felix.

closed time in 6 days

fkoegel

issue comment theislab/anndata2ri

Fail to install on Windows

As you can see from the log, the error happens when trying to install the rpy2 dependency, not anndata2ri. Please search their issues and file a new one if you can’t find anyone with the same problem: https://bitbucket.org/rpy2/rpy2/issues

fkoegel

comment created time in 6 days

issue comment theislab/scanpy

Use GLM-PCA instead of normal PCA?

OK, seems like I misunderstood the point about zero inflation here. You just meant “large number of zeroes” as in “pretty sparse” then?

A factor of 10 isn’t that bad for something that’s more complex, and I doubt PCA speed is the bottleneck for most datasets.

So not a replacement, but an enhancement. As such, it would probably live in scanpy.external, except if you want to develop it within scanpy instead of as a separate package (which is possible, but would tie you to our release cycle, which is currently slow, though we’ll get better).

flying-sheep

comment created time in 6 days

push event flying-sheep/knn.covertree

Philipp A

commit sha 6aca358dd3889a1805a82d9d31f372200f7d17c1

added doi

view details

push time in 6 days

push event theislab/single-cell-tutorial

Philipp A

commit sha c9e05bd3464d4a79ddaf7bfeac8d35245d698263

anndata2ri is on PyPI

view details

push time in 6 days

pull request comment theislab/scanpy

Fix pl paga when adata.uns color list was not set

OK, sounds good!

The problems codacy points out are probably in old, moved code and therefore not yours.

fidelram

comment created time in 6 days

issue closed theislab/scanpy

AttributeError: module 'tables' has no attribute 'which_lib_version'

Hello,

I am a beginner in the bioinformatics field. I have been using scanpy for a year, and I really like it. But yesterday, after I downloaded and installed the new development version from the scanpy GitHub, running the first command, import scanpy as sc, displayed this error: AttributeError: module 'tables' has no attribute 'which_lib_version'. I don't know how to deal with it. Does "module 'tables'" in the error message mean pytables? After seeing the message, I reinstalled both pytables and scanpy through the conda channel, but the error persists. I would appreciate any suggestions. Thank you so much!

closed time in 7 days

yuntianzhishang

issue comment theislab/scanpy

AttributeError: module 'tables' has no attribute 'which_lib_version'

Well, generally updating everything could fix it, but another possibility is that you have a tables.py in your $PYTHONPATH (sys.path) which gets imported instead of the real pytables package.
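A quick way to check which module you’re actually importing (a minimal sketch):

import tables

# if this prints a path outside site-packages, a stray tables.py is shadowing pytables
print(tables.__file__)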

yuntianzhishang

comment created time in 7 days

pull request comment theislab/scanpy

Fix pl paga when adata.uns color list was not set

Looks good, but I can’t really see your changes in the diff because you moved the functions. Can you paste a diff of just the function code as a comment?

fidelram

comment created time in 7 days

issue comment IRkernel/IRkernel

IRkernel installed, but still get "No kernel grammar for Null grammar found" in hydrogen

Well, read the description in your last screenshot: your mapping is the wrong way around.

Maxico11

comment created time in 7 days

issue closed IRkernel/IRkernel

IRkernel installed, but still get "No kernel grammar for Null grammar found" in hydrogen

Hi,

I installed the IRkernel via the official instructions (using miniconda prompt with R.exe), and by adding the PATH variables necessary to bypass the 127 error when running IRkernel::installspec().

When I run jupyter kernelspec list in the conda command prompt, it returns the output shown in the screenshot, which shows that the IR kernel is not on the same path as the Python one.

I dragged and dropped the kernel into my miniconda path, and the file is there, as we can see in the screenshot. The problem is that when I try running simple R code in Atom (1+1), it still gives me the error message "No kernel grammar for Null grammar found".

I configured the IRkernel in Hydrogen as shown in the screenshot.

Could you help me understand why Hydrogen won’t recognize the language? Thank you very much

Maxime

closed time in 7 days

Maxico11

issue comment IRkernel/IRkernel

tibble in exported html is not as beautiful as it is in jupyter

I don’t add any styling; it’s up to your frontend to do that.

You should use a custom template. Make sure you read the rest of that docs page (e.g. “Template structure”) to know how to build one. I assume you just need to inherit from the default and override the html_head block to contain a <style/> tag.

Thank you for thinking my table rendering is beautiful!

Youguang

comment created time in 7 days

issue closed IRkernel/IRkernel

Error on Installing IRkernel

I have to install IRkernel

devtools::install_github('IRkernel/IRkernel')

The installation goes normally until the point shown in the screenshot below.

(screenshot)

I tried removing digest and installing it again. I got the same error. Please help.

closed time in 7 days

kumodjha

issue comment IRkernel/IRkernel

Error on Installing IRkernel

It tells you it can’t remove the prior installation of “digest”, which is clearly unrelated to IRkernel.

No idea what causes this; maybe you should google it, or search/ask on Stack Overflow?

kumodjha

comment created time in 7 days

issue opened nose-devs/nose2

Support configuring through pyproject.toml

The “tool table” section of PEP 518 says that tools should store their configuration under [tool.<pypiname>] in pyproject.toml.
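A minimal sketch of what that could look like for nose2 (the section name follows the PEP’s convention, but the key and the use of the third-party toml package are assumptions for illustration):

# sketch: how nose2 could read its settings from a [tool.nose2] table
import toml  # third-party parser; the stdlib had no TOML support at the time

config = toml.load("pyproject.toml").get("tool", {}).get("nose2", {})
verbosity = config.get("verbosity", 1)  # hypothetical key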

created time in 7 days

issue comment psf/black

Offer option other than pyproject.toml for config

pylint added pyproject.toml support! So if one doesn’t need to configure pytest, they can have all their config in pyproject.toml, e.g. when using tox, black, pylint, isort, and poetry.

altendky

comment created time in 7 days

push event flying-sheep/awesome-python-packaging

Philipp A

commit sha 76123e8fc546ccd52f7a24d3d4a23e43161e9048

Pylint now supports pyproject.toml

view details

push time in 7 days

pull request comment PyCQA/pylint

Can read setup.cfg and pyproject.toml files

awesome, thank you!

AWhetter

comment created time in 7 days

issue opened theislab/scanpy

Use GLM-PCA instead of normal PCA?

Seems like a rigorous way to deal with zero inflation, since we use PCA as the basis for all subsequent processing. It’s apparently much faster than e.g. ZINB-WaVE.

cc @willtownes

created time in 7 days

push event flying-sheep/ggplot.multistats

Philipp A

commit sha c987381d764f1220c13145896e646a2fdb39e962

review fixes

view details

push time in 7 days

push event flying-sheep/knn.covertree

Philipp A

commit sha 13885229b81680cd7775078e11a337d565c2f528

fixed review problems

view details

push time in 7 days

issue comment theislab/anndata

problem with reading files in release 0.6.22.post1

Is there an issue about newer h5py versions causing problems?

fidelram

comment created time in 8 days

push event flying-sheep/phil.red

Philipp A

commit sha ed9c61623f4d5566e90bb4890e2742685b845e61

Some updates

view details

push time in 8 days

push event flying-sheep/SpatialDE

Philipp A

commit sha c7d52f43801ef8b0d91a71efcd843057ace2b4e6

Whoops, actually use link

view details

push time in 8 days

PR opened Teichlab/SpatialDE

README fixes

I didn’t do all tables though

+835 -940

0 comment

1 changed file

pr created time in 8 days

push event flying-sheep/SpatialDE

Philipp A

commit sha bea3dc38d1ffcdf6b983cf655b0d62fe7e3628f4

README fixes

view details

push time in 8 days

fork flying-sheep/SpatialDE

Test genes for Spatial Variation

fork in 8 days

issue opened patoline/patoline

The hosted patobook PDF is currently pretty broken

  1. end of page 5: “typographic rules described for instance in .”
  2. when pasting the snippet above, the pasted text had “{” instead of “f”.
  3. On page 7 onward, there’s \\ instead of \ in inline code
  4. The graphic that should appear on page 14 appears cut off and very large on page 15
  5. On page 18: “Chapter ??”

created time in 8 days

Pull request review comment theislab/scanpy

The method for annotating genes with cell types

+"""\+Annotates gene expression (cell data) with cell types.+"""+import warnings++from anndata import AnnData+import pandas as pd+++def annotator(+    adata: AnnData,+    markers: pd.DataFrame,+    num_genes: int = None,+    return_nonzero_annotations: bool = True,+    p_threshold: float = 0.05,+    p_value_fun: str = "binom",+    z_threshold: float = 1.0,+    scoring: str = "exp_ratio",+    normalize: bool = False,+):+    """\+    Annotator marks the data with cell type annotations based on marker genes.++    Over-expressed genes are selected with the Mann-Whitney U tests and cell+    types are assigned with the hypergeometric test. This function first selects+    genes from gene expression data with the Mann-Whitney U test, then annotate+    them with the hypergeometric test, and finally filter out cell types that+    have zero scores for all cells. The results are scores that tell how+    probable is each cell type for each cell.++    Parameters+    ----------+    adata+        Tabular data with gene expressions.+    markers+        The data-frame with marker genes and cell types. Data-frame has two+        columns **Gene** and **Cell Type** first holds gene names or ID and+        second cell type for this gene. Gene names must be written in the same+        format than genes in `adata`.+    num_genes+        The number of genes that the organism has.+    return_nonzero_annotations+        If true return scores only for cell types that have no zero scores.+    p_threshold+        A threshold for accepting the annotations. Annotations that have FDR+        value bellow this threshold are used.+    p_value_fun+        A function that calculates a p-value. It can be either+        `binom` that uses binom.sf or+        `hypergeom` that uses hypergeom.sf.+    z_threshold+        The threshold for selecting the gene from gene expression data.+        For each cell the attributes with z-value above this value are selected.+    scoring+        Scoring method for cell type scores. Available scores are:++        exp_ratio+            Proportion of genes typical for a cell type expressed in the cell+        sum_of_expressed_markers+            Sum of expressions of genes typical for a cell type+        log_fdr+            Negative of the logarithm of an false discovery rate (FDR) value+        log_p_value+            Negative of the logarithm of a p-value+    normalize : bool, optional (default = False)+        If this parameter is True data will be normalized during the+        a process with a log CPM normalization.+        That method works correctly data needs to be normalized.+        Set this `normalize` on True if your data are not normalized already.++    Returns+    -------+    pd.DataFrame

Let’s be perfectly clear: inplace and copy are both used; sometimes copy is being deprecated in favor of inplace.

copy means “if copy=True, return a copy of the AnnData object, with var/obs modified. if copy=False, return None and modify the AnnData object’s var/obs”

inplace means “if inplace=True, modify the AnnData object; if inplace=False, return an object with computation results”. There are two variants for when it’s True: a) the AnnData object’s obs/var get new columns, or b) the AnnData object is filtered.
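A minimal sketch of the two conventions (function, column, and helper names are hypothetical):

import numpy as np

def compute(adata):
    """Stand-in for the actual computation."""
    return np.zeros(adata.n_obs)

def do_thing_copy(adata, copy=False):
    # “copy” convention: return a modified copy, or modify in place and return None
    adata = adata.copy() if copy else adata
    adata.obs["result"] = compute(adata)
    return adata if copy else None

def do_thing_inplace(adata, inplace=True):
    # “inplace” convention: annotate the object, or return the computed values
    result = compute(adata)
    if inplace:
        adata.obs["result"] = result  # variant a): obs/var get new columns
        return None
    return result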

I updated my comment above

PrimozGodec

comment created time in 9 days

issue closed theislab/anndata2ri

Conversion 'py2rpy' not defined for objects of type '<class 'anndata.core.anndata.AnnData'>'

Hi,

I met an error when input an anndata to R (%%R -i adata_test -o ent_de -o paneth_de).

---------------------------------------------------------------------------
NotImplementedError                       Traceback (most recent call last)
<ipython-input-50-a0d852b414ca> in <module>
----> 1 get_ipython().run_cell_magic('R', '-i adata_test -o ent_de -o paneth_de', '\nlibrary(\'monocle\')\n# sca <- SceToSingleCellAssay(adata_test, class = "SingleCellAssay")\n')

~/anaconda3/envs/scanpy/lib/python3.6/site-packages/IPython/core/interactiveshell.py in run_cell_magic(self, magic_name, line, cell)
   2357             with self.builtin_trap:
   2358                 args = (magic_arg_s, cell)
-> 2359                 result = fn(*args, **kwargs)
   2360             return result
   2361 

<decorator-gen-828> in R(self, line, cell, local_ns)

~/anaconda3/envs/scanpy/lib/python3.6/site-packages/IPython/core/magic.py in <lambda>(f, *a, **k)
    185     # but it's overkill for just that one bit of state.
    186     def magic_deco(arg):
--> 187         call = lambda f, *a, **k: f(*a, **k)
    188 
    189         if callable(arg):

~/anaconda3/envs/scanpy/lib/python3.6/site-packages/rpy2/ipython/rmagic.py in R(self, line, cell, local_ns)
    710                         raise NameError("name '%s' is not defined" % input)
    711                 with localconverter(converter) as cv:
--> 712                     ro.r.assign(input, val)
    713 
    714         tmpd = self.setup_graphics(args)

~/anaconda3/envs/scanpy/lib/python3.6/site-packages/rpy2/robjects/functions.py in __call__(self, *args, **kwargs)
    190                 kwargs[r_k] = v
    191         return (super(SignatureTranslatedFunction, self)
--> 192                 .__call__(*args, **kwargs))
    193 
    194 

~/anaconda3/envs/scanpy/lib/python3.6/site-packages/rpy2/robjects/functions.py in __call__(self, *args, **kwargs)
    111 
    112     def __call__(self, *args, **kwargs):
--> 113         new_args = [conversion.py2rpy(a) for a in args]
    114         new_kwargs = {}
    115         for k, v in kwargs.items():

~/anaconda3/envs/scanpy/lib/python3.6/site-packages/rpy2/robjects/functions.py in <listcomp>(.0)
    111 
    112     def __call__(self, *args, **kwargs):
--> 113         new_args = [conversion.py2rpy(a) for a in args]
    114         new_kwargs = {}
    115         for k, v in kwargs.items():

~/anaconda3/envs/scanpy/lib/python3.6/functools.py in wrapper(*args, **kw)
    805                             '1 positional argument')
    806 
--> 807         return dispatch(args[0].__class__)(*args, **kw)
    808 
    809     funcname = getattr(func, '__name__', 'singledispatch function')

~/anaconda3/envs/scanpy/lib/python3.6/site-packages/rpy2/robjects/numpy2ri.py in nonnumpy2rpy(obj)
    110         # For now, go with the default_converter.
    111         # TODO: the conversion system needs an overhaul badly.
--> 112         return ro.default_converter.py2rpy(obj)
    113     else:
    114         # The conversion module was "activated"

~/anaconda3/envs/scanpy/lib/python3.6/functools.py in wrapper(*args, **kw)
    805                             '1 positional argument')
    806 
--> 807         return dispatch(args[0].__class__)(*args, **kw)
    808 
    809     funcname = getattr(func, '__name__', 'singledispatch function')

~/anaconda3/envs/scanpy/lib/python3.6/site-packages/rpy2/robjects/conversion.py in _py2rpy(obj)
     58     raise NotImplementedError(
     59         "Conversion 'py2rpy' not defined for objects of type '%s'" %
---> 60         str(type(obj))
     61     )
     62 

NotImplementedError: Conversion 'py2rpy' not defined for objects of type '<class 'anndata.core.anndata.AnnData'>'

closed time in 9 days

jphe

issue comment theislab/anndata2ri

Conversion 'py2rpy' not defined for objects of type '<class 'anndata.core.anndata.AnnData'>'

Wait a minute, you don’t even have rpy2 version 3.x installed, right? You have to make sure you actually have the correct dependencies installed …

jphe

comment created time in 9 days

issue comment theislab/anndata2ri

Conversion 'py2rpy' not defined for objects of type '<class 'anndata.core.anndata.AnnData'>'

Did you do %load_ext rpy2.ipython?

jphe

comment created time in 9 days

issue comment theislab/anndata

problem with reading files in release 0.6.22.post1

Not good :neutral_face: The current master state of IO is pretty experimental; I don’t think we should release it just yet.

fidelram

comment created time in 9 days

issue comment theislab/anndata2ri

Conversion 'py2rpy' not defined for objects of type '<class 'anndata.core.anndata.AnnData'>'

In order to format exception blocks correctly, you need to wrap them in this:

```pytb
Traceback...
```

About your issue: You need to explain how you initialized anndata2ri. Did you follow the instructions in the readme?

jphe

comment created time in 10 days

release flying-sheep/vscode-context

v0.0.1

released time in 10 days

created tag flying-sheep/vscode-context

tag v0.0.1

created time in 10 days

push event flying-sheep/vscode-context

Philipp A

commit sha 6310f6a2c8ad1b8be27c04bcbcb58822fa405cbb

added logo

view details

push time in 10 days

pull request comment theislab/scanpy

update bbknn reference

Right. Thank you!

ktpolanski

comment created time in 10 days

push event theislab/scanpy

Krzysztof Polanski

commit sha e46f89bf12ee5ea75c1e1064388dc43ff849840a

update bbknn reference (#861)

view details

push time in 10 days

PR merged theislab/scanpy

update bbknn reference

BBKNN came out as an actual paper a couple of months ago, updating the reference to point to it.

+5 -5

0 comment

3 changed files

ktpolanski

pr closed time in 10 days

pull request comment theislab/anndata

[WIP] Faster sparse backed access

OK, in the light of backed mode, dense-as-sparse and sparse-as-dense make sense.

think I've just been running into scanpy functions which mainly allow lists (to the exclusion of np.ndarrays or pd.Series)

I got rid of isinstance(thing, list) everywhere in favor of isinstance(thing, collections.abc.Sequence), but didn’t realize that’s only slightly better, as ndarray isn’t a Sequence. I think Python made a mistake with half of its collection base classes being sensible in their requirements, and half of them requiring you to implement the kitchen sink. We should fix that.
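For illustration, a quick check one can run:

from collections.abc import Sequence
import numpy as np

print(isinstance([1, 2], Sequence))            # True
print(isinstance(np.array([1, 2]), Sequence))  # False: ndarray isn’t registered as a Sequence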

Regarding your thoughts about paths: Sounds sensible, but where would we need a “sequence of paths to arbitrary anndata members”?

ivirshup

comment created time in 10 days

issue closed theislab/anndata

Make .raw work with AnnData.concatenate

If I have a list of adata objects, for example

adata_list = [adata1, adata2, adata3]

and I do

adata = adata_list[0].concatenate(adata_list[1:])

I get that adata.raw is empty, even if adata1, adata2, and adata3 had .raw slots populated.

I think the desired behavior should be to concatenate the .raw data as well (if they all have it).

For the time being, I have worked my way around this, but I think this would be a desirable behavior.

closed time in 10 days

sjfleming

issue comment theislab/anndata

Make .raw work with AnnData.concatenate

Great!

sjfleming

comment created time in 10 days

Pull request review comment theislab/scanpy

Improve accuracy and speed of _get_mean_var

 import numpy as np
-from scipy.sparse import issparse
+from scipy import sparse
+import numba
 
 
-STANDARD_SCALER_FIXED = False
+def _get_mean_var(X):
+    if sparse.issparse(X):
+        mean, var = sparse_mean_variance_axis(X, axis=0)
+    else:
+        mean = np.mean(X, axis=0, dtype=np.float64)
+        mean_sq = np.multiply(X, X).mean(axis=0, dtype=np.float64)
+        var = mean_sq - mean ** 2
+    # enforce R convention (unbiased estimator) for variance
+    var *= X.shape[0] / (X.shape[0] - 1)
+    return mean, var
 
 
-def _get_mean_var(X):
-    # - using sklearn.StandardScaler throws an error related to
-    #   int to long trafo for very large matrices
-    # - using X.multiply is slower
-    if not STANDARD_SCALER_FIXED:
-        mean = X.mean(axis=0)
-        if issparse(X):
-            mean_sq = X.multiply(X).mean(axis=0)
-            mean = mean.A1
-            mean_sq = mean_sq.A1
-        else:
-            mean_sq = np.multiply(X, X).mean(axis=0)
-        # enforece R convention (unbiased estimator) for variance
-        var = (mean_sq - mean ** 2) * (X.shape[0] / (X.shape[0] - 1))
+def sparse_mean_variance_axis(mtx: sparse.spmatrix, axis: int):
+    """
+    This code and internal functions are based on sklearns
+    `sparsefuncs.mean_variance_axis`.
+
+    Modifications:
+    * allow deciding on the output type, which can increase accuracy when calculating the mean and variance of 32bit floats.
+    * This doesn't currently implement support for null values, but could.
+    * Uses numba not cython
+    """
+    assert axis in (0, 1)
+    if isinstance(mtx, sparse.csr_matrix):
+        if axis == 0:
+            return sparse_mean_var_minor_axis(
+                mtx.data, mtx.indices, mtx.shape[0], mtx.shape[1], np.float64
+            )
+        elif axis == 1:
+            return sparse_mean_var_major_axis(
+                mtx.data,
+                mtx.indices,
+                mtx.indptr,
+                mtx.shape[0],
+                mtx.shape[1],
+                np.float64,
+            )
+    elif isinstance(mtx, sparse.csc_matrix):
+        if axis == 0:
+            return sparse_mean_var_major_axis(
+                mtx.data,
+                mtx.indices,
+                mtx.indptr,
+                mtx.shape[1],
+                mtx.shape[0],
+                np.float64,
+            )
+        elif axis == 1:
+            return sparse_mean_var_minor_axis(
+                mtx.data, mtx.indices, mtx.shape[1], mtx.shape[0], np.float64
+            )
     else:
-        from sklearn.preprocessing import StandardScaler
+        raise ValueError(
+            "This function only works on sparse csr and csc matrices"
+        )

1 - axis means 0 if axis is 1, else 1, given axis ∈ {0, 1}.

I tried mtx.T before, but it was a bit slower.

A pity. In this case we should go for my code, as yours would need a fourth function to prevent code duplication, and that’s a lot of functions where one would be clean enough:

if isinstance(mtx, sparse.csr_matrix):
    ax_minor = 1
    shape = mtx.shape
elif isinstance(mtx, sparse.csc_matrix):
    ax_minor = 0
    shape = mtx.shape[::-1]
else:
    raise ValueError(
       "This function only works on sparse csr and csc matrices"
    )
if axis == ax_minor:
    return sparse_mean_var_major_axis(
        mtx.data, mtx.indices, mtx.indptr, *shape, np.float64
    )
else:
    return sparse_mean_var_minor_axis(
        mtx.data, mtx.indices, *shape, np.float64
    )

The only confusing part is if axis == ax_minor: ..._major_axis(), but it’s correct: For csr_matrices, axis 0 is the major axis and axis 1 is the minor axis.

ivirshup

comment created time in 10 days

Pull request review comment theislab/scanpy

The method for annotating genes with cell types

+"""\+Annotates gene expression (cell data) with cell types.+"""+import warnings++from anndata import AnnData+import pandas as pd+++def annotator(+    adata: AnnData,+    markers: pd.DataFrame,+    num_genes: int = None,+    return_nonzero_annotations: bool = True,+    p_threshold: float = 0.05,+    p_value_fun: str = "binom",+    z_threshold: float = 1.0,+    scoring: str = "exp_ratio",+    normalize: bool = False,+):+    """\+    Annotator marks the data with cell type annotations based on marker genes.++    Over-expressed genes are selected with the Mann-Whitney U tests and cell+    types are assigned with the hypergeometric test. This function first selects+    genes from gene expression data with the Mann-Whitney U test, then annotate+    them with the hypergeometric test, and finally filter out cell types that+    have zero scores for all cells. The results are scores that tell how+    probable is each cell type for each cell.++    Parameters+    ----------+    adata+        Tabular data with gene expressions.+    markers+        The data-frame with marker genes and cell types. Data-frame has two+        columns **Gene** and **Cell Type** first holds gene names or ID and+        second cell type for this gene. Gene names must be written in the same+        format than genes in `adata`.+    num_genes+        The number of genes that the organism has.+    return_nonzero_annotations+        If true return scores only for cell types that have no zero scores.+    p_threshold+        A threshold for accepting the annotations. Annotations that have FDR+        value bellow this threshold are used.+    p_value_fun+        A function that calculates a p-value. It can be either+        `binom` that uses binom.sf or+        `hypergeom` that uses hypergeom.sf.+    z_threshold+        The threshold for selecting the gene from gene expression data.+        For each cell the attributes with z-value above this value are selected.+    scoring+        Scoring method for cell type scores. Available scores are:++        exp_ratio+            Proportion of genes typical for a cell type expressed in the cell+        sum_of_expressed_markers+            Sum of expressions of genes typical for a cell type+        log_fdr+            Negative of the logarithm of an false discovery rate (FDR) value+        log_p_value+            Negative of the logarithm of a p-value+    normalize : bool, optional (default = False)+        If this parameter is True data will be normalized during the+        a process with a log CPM normalization.+        That method works correctly data needs to be normalized.+        Set this `normalize` on True if your data are not normalized already.++    Returns+    -------+    pd.DataFrame

That’s not right, it returns an AnnData object.

Also, please follow the way other scanpy functions work: they have an inplace or copy argument that controls whether a copy of the original adata object is made or obs/var is updated in the original.

This function currently filters out cells, but it would be better to just mark them as having no marker cell type found.

PrimozGodec

comment created time in 10 days

Pull request review comment theislab/scanpy

The method for annotating genes with cell types

+"""\+Annotates gene expression (cell data) with cell types.+"""+import warnings++from anndata import AnnData+import pandas as pd+++def annotator(+    adata: AnnData,+    markers: pd.DataFrame,+    num_genes: int = None,+    return_nonzero_annotations: bool = True,+    p_threshold: float = 0.05,+    p_value_fun: str = "binom",+    z_threshold: float = 1.0,+    scoring: str = "exp_ratio",+    normalize: bool = False,+):+    """\+    Annotator marks the data with cell type annotations based on marker genes.++    Over-expressed genes are selected with the Mann-Whitney U tests and cell+    types are assigned with the hypergeometric test. This function first selects+    genes from gene expression data with the Mann-Whitney U test, then annotate+    them with the hypergeometric test, and finally filter out cell types that+    have zero scores for all cells. The results are scores that tell how+    probable is each cell type for each cell.++    Parameters+    ----------+    adata+        Tabular data with gene expressions.+    markers+        The data-frame with marker genes and cell types. Data-frame has two+        columns **Gene** and **Cell Type** first holds gene names or ID and+        second cell type for this gene. Gene names must be written in the same+        format than genes in `adata`.+    num_genes+        The number of genes that the organism has.+    return_nonzero_annotations+        If true return scores only for cell types that have no zero scores.+    p_threshold+        A threshold for accepting the annotations. Annotations that have FDR+        value bellow this threshold are used.+    p_value_fun+        A function that calculates a p-value. It can be either+        `binom` that uses binom.sf or+        `hypergeom` that uses hypergeom.sf.+    z_threshold+        The threshold for selecting the gene from gene expression data.+        For each cell the attributes with z-value above this value are selected.+    scoring+        Scoring method for cell type scores. Available scores are:++        exp_ratio+            Proportion of genes typical for a cell type expressed in the cell+        sum_of_expressed_markers+            Sum of expressions of genes typical for a cell type+        log_fdr+            Negative of the logarithm of an false discovery rate (FDR) value+        log_p_value+            Negative of the logarithm of a p-value+    normalize : bool, optional (default = False)+        If this parameter is True data will be normalized during the+        a process with a log CPM normalization.+        That method works correctly data needs to be normalized.+        Set this `normalize` on True if your data are not normalized already.++    Returns+    -------+    pd.DataFrame+        Cell type for each cell for each cell. The result is a sore matrix that+        tells how probable is each cell type for each cell. Columns are cell+        types and rows are cells.++    Example+    -------++    Here is the example of annotation of dendritic cells based on their gene+    expressions. For annotation, we use data by Villani et al. (2017)[1] and+    marker genes by Franzén et al. (2019)[2].++    [1]  Villani, A. C., Satija, ... Jardine, L. (2017). Single-cell+         RNA-seq reveals new types of human blood dendritic cells, monocytes,+         and progenitors. Science, 356(6335).++    [2]  Oscar Franzén, Li-Ming Gan, Johan L M Björkegren, PanglaoDB:+         a web server for exploration of mouse and human single-cell RNA+         sequencing data, Database, Volume 2019, 2019.

Please put the references into references.rst and reference them correctly, e.g. [Villani17]_

PrimozGodec

comment created time in 10 days

push event PrimozGodec/scanpy

Philipp A

commit sha 5399f8da7cfe0f6466497f384355a5ae9b68b2f8

Fix docs

view details

push time in 10 days

pull request comment theislab/scanpy

The method for annotating genes with cell types

I have slight concerns about the name being too generic, but then again, this does exactly what people expect a “cell type annotator based on marker genes” to do.

PrimozGodec

comment created time in 10 days

pull request comment theislab/scanpy

The method for annotating genes with cell types

Thank you. We’re using pytest though, so please write the tests that way:

  1. Remove the class and make all its methods top-level functions
  2. Make setUp into fixtures
  3. Just use assert
@pytest.fixture
def markers():
    return pd.DataFrame(
        ...
    )


@pytest.fixture
def adata():
    ...
    return AnnData(data.values, var=data.columns.values)


def test_remove_empty_column(adata, markers):
    ...
    annotations = annotator(adata, markers, num_genes=20)
    ...
    assert len(annotations) == len(adata)
    ...
PrimozGodec

comment created time in 10 days

issue closed theislab/scanpy

Integrate data from different treatments and perform differential gene expression analysis according to treatment and cell type

Hi,

First of all, I would like to thank the developers for this awesome tool! I am new to Scanpy. I am migrating from Seurat to Scanpy as I would like to perform trajectory analysis in my data.

I have single cell sequencing data from 12 samples and 3 treatments (so 4 samples per treatment). I merged the samples from the same treatment into a single matrix using the ‘cellranger’ software from 10x Genomics (so I have 3 matrices from 3 different treatments to import into Scanpy).

In ‘Seurat’, I can read the data from my three treatments separated, do quality control, and then integrate them using ‘FindIntegrationAnchors’ and ‘IntegrateData’ functions. Then, I perform cluster analysis in the integrated dataset, and test the effect of treatment on the transcriptome of each cluster.

Is there a similar function in ‘Scanpy’ to integrate different datasets which are labeled in order to perform cluster analyses in the integrated dataset and test for the effect of treatment in the transcriptome of identified cell types? If so, is there a tutorial for that?

In ‘Scanpy’ I am able to import the data and perform quality control and cluster analysis. Thus, if there were a way of integrating the 3 different matrices into one single object, that would be helpful.

Any suggestions on how I should proceed to integrate my data and perform differential gene expression analysis according to treatment and cell type?

Thank you very much!

Joao

closed time in 10 days

JoaoGabrielMoraes

issue comment theislab/scanpy

Integrate data from different treatments and perform differential gene expression analysis according to treatment and cell type

Hi! Welcome to the community. For questions like this, https://scanpy.discourse.group/ would be the ideal place!

Generally: If you can’t find what you’re looking for in the regular API, you can always try scanpy.external, where you should e.g. find answers to your first question. I don’t think we have a tutorial for this yet, though.

For simply concatenating multiple AnnData objects, the AnnData docs should help you out.
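A minimal sketch (the file names are hypothetical; batch_key records which treatment each cell came from, following the concatenate call used earlier in this thread):

import scanpy as sc

# hypothetical files, one per treatment
adatas = [sc.read_h5ad(f"treatment_{i}.h5ad") for i in range(1, 4)]
adata = adatas[0].concatenate(adatas[1:], batch_key="treatment")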

JoaoGabrielMoraes

comment created time in 10 days

issue comment theislab/anndata

scanpy.tl.score_genes fails when run on backed adata

Looks like an anndata issue, not a scanpy issue.

bkmartinjr

comment created time in 10 days

issue comment theislab/anndata2ri

AttributeError: 'Converter' object has no attribute 'py2rpy'

Do you have all dependencies correctly installed?

https://github.com/theislab/anndata2ri/blob/635da2f0c2720019d47efaa26c244668e371c5f4/pyproject.toml#L22-L25

sergiopolid

comment created time in 10 days

push event flying-sheep/vscode-context

Philipp A

commit sha 02ca6a47786669f507bb95f6fccd0ea19bd4ba7b

Tests work

view details

push time in 10 days


create branch flying-sheep/vscode-context

branch : master

created branch time in 14 days

created repository flying-sheep/vscode-context

created time in 14 days

issue closed IRkernel/IRdisplay

how to view tif file?

closed time in 14 days

kongdd

issue comment IRkernel/IRdisplay

how to view tif file?

By converting them to a file format Jupyter supports. The R image format libraries return a matrix or a nativeRaster object. Displaying the latter as an image isn’t supported by repr (but maybe could be), so:

display_png(png::writePNG(tiff::readTIFF("origin.tiff", native=TRUE)))
kongdd

comment created time in 14 days

pull request comment JetBrains/idea-gitignore

Enforce C locale

Sorry, I don’t have the time right now, but you could probably remind me next month.

flying-sheep

comment created time in 14 days

pull request comment theislab/anndata

Improve alignedmapping typing

Isn’t pd.DataFrame(adata.X).do_stuff() easy enough? I’m pretty sure it creates a slim view over the array, so doing this on the fly should be pretty much free.
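A quick demonstration of the no-copy behaviour (this holds for a homogeneous 2D float array in pandas versions without copy-on-write):

import numpy as np
import pandas as pd

X = np.arange(6.0).reshape(2, 3)
df = pd.DataFrame(X)  # wraps the ndarray without copying
df.iloc[0, 0] = 42.0
print(X[0, 0])        # 42.0: the underlying array is shared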

Another thing I want to enforce: No layer called X, '' or None, as I want to unify layers and X down the road.

flying-sheep

comment created time in 14 days

pull request comment theislab/anndata

[WIP] Faster sparse backed access

Should we allow appending sparse matrices when they don't match in size on the non-extending axis?

We shouldn’t, you’re right. I don’t like broadcasting; it’s too surprising if you didn’t intend to do it.

I think the user should have some control over which elements are written as dense / read as sparse. This should also not be set on the file, but set by the arguments used to read it.

Agreed that this usage makes more sense, but why is any of this needed at all for h5ad? I can see it making sense for reading formats that don’t support sparse matrices, but for h5ad:

Reading dense as sparse means someone didn’t use the right format for storing their matrix in the AnnData object before writing. Writing sparse as dense seems like it could be useful for compatibility with readers from other programming languages. But in that case, writing a reader that supports our sparse representation seems like the better idea.

These would both be of type List[Union[Tuple[str, str], str]]

Never List in a parameter type, but I’m also not a fan of the whole signature. This is basically paths, right? I think the signature should be Mapping[str, Sequence[Hashable]], with {'layers': None | 'X'} referring to adata.X.
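A hypothetical sketch of that shape (the function name and exact semantics are invented for illustration):

from typing import Hashable, Mapping, Sequence

def read_elems(path: str, elems: Mapping[str, Sequence[Hashable]]):
    """Read only the named members of each AnnData attribute (illustration only)."""
    ...

# e.g. one layer, plus X itself (spelled None), plus a single obsm member:
read_elems("data.h5ad", {"layers": ["spliced", None], "obsm": ["X_pca"]})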

This being said, we should ban calling a layer 'X' or '' or None because I want to unify layers and X.

ivirshup

comment created time in 14 days

Pull request review comment theislab/anndata

[WIP] Faster sparse backed access

     asview,
     _resolve_idxs,
 )
-from .. import h5py, utils
+from .sparsedataset import SparseDataset

Let’s call it “storage” then!

ivirshup

comment created time in 14 days

pull request comment theislab/anndata

Support for ordered categoricals.

I’m happy with this!

ivirshup

comment created time in 14 days

pull request comment theislab/scanpy

Cell hashing

Hmm, I’m pretty happy with my self-documented test code:

    def test_deferred_imports(imported_modules):
        slow_to_import = {
            'umap',             # neighbors, tl.umap
            'seaborn',          # plotting
            'sklearn.metrics',  # neighbors
            'scipy.stats',      # tools._embedding_density
            'networkx',         # diffmap, paga, plotting._utils
            # TODO: 'matplotlib.pyplot',
            # TODO (maybe): 'numba',
        }
        falsely_imported = slow_to_import & imported_modules
>       assert not falsely_imported
E       AssertionError: assert not {'scipy.stats'}

Do you think this could be clearer?

njbernstein

comment created time in 14 days

Pull request review comment theislab/scanpy

Improve accuracy and speed of _get_mean_var

 import numpy as np
-from scipy.sparse import issparse
+from scipy import sparse
+import numba
 
 
-STANDARD_SCALER_FIXED = False
+def _get_mean_var(X):
+    if sparse.issparse(X):
+        mean, var = sparse_mean_variance_axis(X, axis=0)
+    else:
+        mean = np.mean(X, axis=0, dtype=np.float64)
+        mean_sq = np.multiply(X, X).mean(axis=0, dtype=np.float64)
+        var = mean_sq - mean ** 2
+    # enforce R convention (unbiased estimator) for variance
+    var *= X.shape[0] / (X.shape[0] - 1)
+    return mean, var
 
 
-def _get_mean_var(X):
-    # - using sklearn.StandardScaler throws an error related to
-    #   int to long trafo for very large matrices
-    # - using X.multiply is slower
-    if not STANDARD_SCALER_FIXED:
-        mean = X.mean(axis=0)
-        if issparse(X):
-            mean_sq = X.multiply(X).mean(axis=0)
-            mean = mean.A1
-            mean_sq = mean_sq.A1
-        else:
-            mean_sq = np.multiply(X, X).mean(axis=0)
-        # enforece R convention (unbiased estimator) for variance
-        var = (mean_sq - mean ** 2) * (X.shape[0] / (X.shape[0] - 1))
+def sparse_mean_variance_axis(mtx: sparse.spmatrix, axis: int):
+    """
+    This code and internal functions are based on sklearns
+    `sparsefuncs.mean_variance_axis`.
+
+    Modifications:
+    * allow deciding on the output type, which can increase accuracy when calculating the mean and variance of 32bit floats.
+    * This doesn't currently implement support for null values, but could.
+    * Uses numba not cython
+    """
+    assert axis in (0, 1)
+    if isinstance(mtx, sparse.csr_matrix):
+        if axis == 0:
+            return sparse_mean_var_minor_axis(
+                mtx.data, mtx.indices, mtx.shape[0], mtx.shape[1], np.float64
+            )
+        elif axis == 1:
+            return sparse_mean_var_major_axis(
+                mtx.data,
+                mtx.indices,
+                mtx.indptr,
+                mtx.shape[0],
+                mtx.shape[1],
+                np.float64,
+            )
+    elif isinstance(mtx, sparse.csc_matrix):
+        if axis == 0:
+            return sparse_mean_var_major_axis(
+                mtx.data,
+                mtx.indices,
+                mtx.indptr,
+                mtx.shape[1],
+                mtx.shape[0],
+                np.float64,
+            )
+        elif axis == 1:
+            return sparse_mean_var_minor_axis(
+                mtx.data, mtx.indices, mtx.shape[1], mtx.shape[0], np.float64
+            )
     else:
-        from sklearn.preprocessing import StandardScaler
+        raise ValueError(
+            "This function only works on sparse csr and csc matrices"
+        )

My code is the complete thing, so nothing has to be rewritten.

If you think my version is too cute, you can use better variable names, or, as above, an if/else expression instead of converting the bool to an int.

I’m also OK with your version if you replace the last function with:

@sparse_mean_variance_axis.register(sparse.csc_matrix)
def sparse_mean_variance_axis_csc(mtx, axis):
    # use csr implementation
    return sparse_mean_variance_axis(mtx.T, 1 - axis)
ivirshup

comment created time in 14 days

Pull request review comment theislab/scanpy

Improve accuracy and speed of _get_mean_var

 import numpy as np
-from scipy.sparse import issparse
+from scipy import sparse
+import numba
 
 
-STANDARD_SCALER_FIXED = False
+def _get_mean_var(X):
+    if sparse.issparse(X):
+        mean, var = sparse_mean_variance_axis(X, axis=0)
+    else:
+        mean = np.mean(X, axis=0, dtype=np.float64)
+        mean_sq = np.multiply(X, X).mean(axis=0, dtype=np.float64)
+        var = mean_sq - mean ** 2
+    # enforce R convention (unbiased estimator) for variance
+    var *= X.shape[0] / (X.shape[0] - 1)
+    return mean, var
 
 
-def _get_mean_var(X):
-    # - using sklearn.StandardScaler throws an error related to
-    #   int to long trafo for very large matrices
-    # - using X.multiply is slower
-    if not STANDARD_SCALER_FIXED:
-        mean = X.mean(axis=0)
-        if issparse(X):
-            mean_sq = X.multiply(X).mean(axis=0)
-            mean = mean.A1
-            mean_sq = mean_sq.A1
-        else:
-            mean_sq = np.multiply(X, X).mean(axis=0)
-        # enforece R convention (unbiased estimator) for variance
-        var = (mean_sq - mean ** 2) * (X.shape[0] / (X.shape[0] - 1))
+def sparse_mean_variance_axis(mtx: sparse.spmatrix, axis: int):
+    """
+    This code and internal functions are based on sklearns
+    `sparsefuncs.mean_variance_axis`.
+
+    Modifications:
+    * allow deciding on the output type, which can increase accuracy when calculating the mean and variance of 32bit floats.
+    * This doesn't currently implement support for null values, but could.
+    * Uses numba not cython
+    """
+    assert axis in (0, 1)
+    if isinstance(mtx, sparse.csr_matrix):
+        if axis == 0:
+            return sparse_mean_var_minor_axis(
+                mtx.data, mtx.indices, mtx.shape[0], mtx.shape[1], np.float64
+            )
+        elif axis == 1:
+            return sparse_mean_var_major_axis(
+                mtx.data,
+                mtx.indices,
+                mtx.indptr,
+                mtx.shape[0],
+                mtx.shape[1],
+                np.float64,
+            )
+    elif isinstance(mtx, sparse.csc_matrix):
+        if axis == 0:
+            return sparse_mean_var_major_axis(
+                mtx.data,
+                mtx.indices,
+                mtx.indptr,
+                mtx.shape[1],
+                mtx.shape[0],
+                np.float64,
+            )
+        elif axis == 1:
+            return sparse_mean_var_minor_axis(
+                mtx.data, mtx.indices, mtx.shape[1], mtx.shape[0], np.float64
+            )
     else:
-        from sklearn.preprocessing import StandardScaler
+        raise ValueError(
+            "This function only works on sparse csr and csc matrices"
+        )

Please condense this to two branches instead of four.

csr = isinstance(mtx, sparse.csr_matrix)
if not csr and not isinstance(mtx, sparse.csc_matrix):
    raise ValueError(
        "This function only works on sparse csr and csc matrices"
    )
shape = mtx.shape if csr else mtx.shape[::-1]
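# CSR with axis == 1 and CSC with axis == 0 take the indptr-based (major-axis) code path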
if axis == int(csr):
    return sparse_mean_var_major_axis(
        mtx.data, mtx.indices, mtx.indptr, *shape, np.float64
    )
else:
    return sparse_mean_var_minor_axis(
        mtx.data, mtx.indices, *shape, np.float64
    )
ivirshup

comment created time in 15 days

issue comment jupyter/jupyter_console

Jupyter console configuration for colors

Hmm, can someone please help? I tried

ipython --ZMQTerminalInteractiveShell.highlighting_style_overrides='{"Token.Prompt": "ansigreen"}'

But nothing changed! References:

  • https://github.com/jupyter/jupyter_console/blob/5755ec42a95b70492813be4a6bb8342015126062/jupyter_console/ptshell.py#L413-L438
  • https://github.com/prompt-toolkit/python-prompt-toolkit/blob/d3ad16dd7baf50267b7c6ee8b56d6f49008075f6/prompt_toolkit/styles/base.py#L51-L61
unode

comment created time in 15 days

pull request comment theislab/scanpy

update ValueError message in pca

Thank you!

fabianrost84

comment created time in 15 days

push event theislab/scanpy

Fabian Rost

commit sha d8f32c040f3a5f4fc07998b269796ca58de84b40

update ValueError message in pca (#858)

replaced the deprecated filter_gene_dispersion with highly_variable_genes

view details

push time in 15 days

PR merged theislab/scanpy

update ValueError message in pca

replaced the deprecated filter_gene_dispersion with highly_variable_genes

+1 -1

0 comment

1 changed file

fabianrost84

pr closed time in 15 days

pull request comment theislab/scanpy

Improve accuracy and speed of _get_mean_var

Great, then let’s get rid of that branch!

ivirshup

comment created time in 15 days

pull request comment JetBrains/idea-gitignore

Enforce C locale

You can still merge this. It’ll enforce correct behavior if you at any point want to parse git output.
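The underlying idea, as a minimal sketch (the plugin itself is JVM code; this just illustrates forcing the C locale so that git output stays machine-parseable):

import os
import subprocess

# force untranslated, locale-independent output before parsing it
env = {**os.environ, "LC_ALL": "C"}
out = subprocess.run(
    ["git", "status", "--porcelain"],
    env=env, capture_output=True, text=True, check=True,
).stdout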

flying-sheep

comment created time in 15 days
