Why were some cells specified in "norm_cell" classified as tumor cells? #129

Open
Zjianglin opened this issue Dec 13, 2024 · 0 comments

Hi developer, I'm using SCEVAN to identify tumor cells in a liver cancer sample (10x scRNA-seq). I assumed that all immune cells (e.g. neutrophils, myeloid, T, and B cells) are normal cells, so I passed their cell barcodes to the norm_cell parameter.

However, the SCEVAN results show that some of these immune cells were classified as tumor cells. Why does this happen? Does the norm_cell parameter not work?

By the way, the tumor cell count in the running output ("found 4451 tumor cells") does not match the count in the class column of the result data.frame: table(sce8154$class) gives filtered 1, normal 4177, tumor 4406.
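
To quantify how many of the supplied norm_cell barcodes ended up in each class, the membership can be cross-tabulated against the class column. This is a minimal sketch in base R, assuming the result data frame has the cell.names and class columns shown in the results further down. check_norm_cells is a hypothetical helper, not a SCEVAN function, and the toy barcodes are made up.

```r
# Hypothetical helper (not part of SCEVAN): cross-tabulate whether each cell
# was supplied via norm_cell against the class assigned in the result.
check_norm_cells <- function(res, norm_cells) {
  # Assumes the result has these two columns, as in the issue's data frame.
  stopifnot(all(c("cell.names", "class") %in% colnames(res)))
  supplied <- factor(res$cell.names %in% norm_cells,
                     levels = c(FALSE, TRUE),
                     labels = c("not_supplied", "supplied_as_normal"))
  table(supplied = supplied, class = res$class)
}

# Toy data standing in for the real result (made-up barcodes).
toy <- data.frame(
  cell.names = c("c1", "c2", "c3", "c4"),
  class      = c("normal", "tumor", "tumor", "normal"),
  stringsAsFactors = FALSE
)
tab <- check_norm_cells(toy, norm_cells = c("c1", "c2"))
print(tab)
```

On the real data this would be called as check_norm_cells(sce8154, norm_cells_li[["8154"]]); a non-zero count in the supplied_as_normal / tumor cell confirms that supplied reference cells were classified as tumor.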

analysis code:

normal_cts <- c('Neutrophils', 'Kupffer', 'Proli. Kupffer',
                'Clec4f-Macro', 'Monocyte', 'pDC', 'cDC1', 'cDC2',
                'Cd4+T','Cd8+T', 'NK',  'Plasmocyte', 'B cell', 'Basophils'
                )

raw_count_li <- list()
norm_cells_li <- list()

for (samplex in sort(unique(sc_sbj$orig.ident))) {
  tmp_sbj = subset(sc_sbj, subset=(!!rlang::sym(sample_key) == samplex), slot = 'counts')
  refmtxx = tmp_sbj@meta.data
  refmtxx = refmtxx[(refmtxx[[celltype_key]] %in% normal_cts), ]
  ref_cellids = rownames(refmtxx) #Cells(nrm_sbj)
  nrm_strx = ''
  cntx = table(refmtxx[[celltype_key]])
  for (x in names(cntx)) {
    nrm_strx = sprintf('%s, %s:%d', nrm_strx, x, cntx[x])
  }

  print(sprintf('%s: %s, Normal cells (%d): %s', samplex, toString(dim(tmp_sbj)), nrow(refmtxx), nrm_strx))

  raw_count_li[[samplex]] = GetAssayData(tmp_sbj, assay='RNA', slot='counts')
  norm_cells_li[[samplex]] = ref_cellids
}

for (samplex in sort(unique(sc_sbj$orig.ident))) {
  cat(sprintf('\n\tSCEVAN: Sample %s (%d nrom_cells) \n', samplex, length(norm_cells_li[[samplex]])))
  print(head(norm_cells_li[[samplex]]))
  scevan_sres <- pipelineCNA(raw_count_li[[samplex]],
                             sample = samplex,
                             par_cores = nthreads,
                             norm_cell = norm_cells_li[[samplex]],
                             SUBCLONES = TRUE,
                             beta_vega = 0.5,
                             ClonalCN = TRUE,
                             plotTree = TRUE,
                             AdditionalGeneSets = NULL,
                             SCEVANsignatures = TRUE,
                             organism = "mouse")
}



running output:

SCEVAN: Sample 8154 (7179 nrom_cells) 
[1] "AAACCCAGTATCTCGA-8154" "AAACCCAGTCTCACGG-8154" "AAACCCAGTGTTCGTA-8154"
[4] "AAACGAAAGAAGGATG-8154" "AAACGAAAGATTGCGG-8154" "AAACGAAAGCGGACAT-8154"
[1] " raw data - genes: 24413 cells: 8584"
[1] "1) Filter: cells > 200 genes"
[1] "low data quality"
[1] "2) Filter: genes > 5% of cells"
[1] "8898 genes past filtering"
[1] "3) Annotations gene coordinates"
[1] "8062 genes annotated"
[1] "4) Filter: genes involved in the cell cycle"
[1] "7686 genes past filtering "
[1] "5)  Filter: cells > 5genes per chromosome "
[1] "6) Log Freeman Turkey transformation"
[1] "A total of 8583 cells, 7686 genes after preprocessing"
[1] "7) Measuring baselines (confident normal cells)"
[1] "8) Smoothing data"
[1] "9) Segmentation (VegaMC)"
[1] "10) Adjust baseline"
[1] "11) plot heatmap"
[1] "found 4451 tumor cells"
[1] "time classify tumor cells:  6.12251765330633"

results:

> table(sce8154$class)
filtered   normal    tumor 
       1     4177     4406 
> table(sce8154[, c('anno1', 'class')])
                class
anno1            filtered normal tumor
  Hepatocyte            0     45   390
  Cholangiocytes        0      0     0
  Endothelial           0      1   908
  HSC                   0      0    61
  Neutrophils           0      0   366
  Kupffer               0    583    39
  Proli. Kupffer        0    193     2
  Clec4f-Macro          0   1731    41
  Monocyte              1    949   468
  pDC                   0     15   133
  cDC1                  0    177    12
  cDC2                  0    477    93
  Cd4+T                 0      0   145
  Cd8+T                 0      2  1110
  NK                    0      0   111
  Plasmocyte            0      0     4
  B cell                0      4   502
  Basophils             0      0    21

> head(sce8154[, 1:6], n=10)
              cell.names  class confidentNormal subclone        anno1 SCEVANpred
1  AAACCCAGTATCTCGA-8154 normal             yes       NA         cDC2     normal
2  AAACCCAGTCTCACGG-8154  tumor             yes        1       B cell      tumor
3  AAACCCAGTGTTCGTA-8154  tumor             yes        2      Kupffer      tumor
4  AAACGAAAGAAGGATG-8154 normal             yes       NA Clec4f-Macro     normal
5  AAACGAAAGATTGCGG-8154  tumor             yes        3        Cd8+T      tumor
6  AAACGAAAGCGGACAT-8154 normal             yes       NA         cDC1     normal
7  AAACGAAAGTCATGGG-8154 normal             yes       NA     Monocyte     normal
8  AAACGAACAATCTGCA-8154  tumor             yes        1       B cell      tumor
9  AAACGAAGTGTACATC-8154 normal             yes       NA Clec4f-Macro     normal
10 AAACGAAGTTGTTGTG-8154  tumor             yes        3        Cd8+T      tumor
