
najafmurtaza/General_Sentence_Embeddings 2

Extract Sentence Embeddings from Hugging Face pre-trained models.
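The usual way to turn per-token Hugging Face outputs into one sentence vector is to pool them. A minimal numpy sketch of masked mean pooling (illustrative shapes and names only, no transformers dependency):

```python
import numpy as np

def mean_pool(token_embeddings, attention_mask):
    """Masked mean pooling: average token vectors, ignoring padding.

    token_embeddings: (seq_len, dim) array of per-token vectors
    attention_mask:   (seq_len,) array of 1s (real tokens) and 0s (padding)
    """
    mask = attention_mask[:, None].astype(np.float32)   # (seq_len, 1)
    summed = (token_embeddings * mask).sum(axis=0)      # sum over real tokens only
    count = np.clip(mask.sum(), 1e-9, None)             # avoid divide-by-zero
    return summed / count

# Toy example: 3 tokens, the last one is padding
tokens = np.array([[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]])
mask = np.array([1, 1, 0])
print(mean_pool(tokens, mask))  # -> [2. 3.]
```

With a real model, `token_embeddings` would be the last hidden state for one sentence and `attention_mask` would come from the tokenizer.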

najafmurtaza/Analytica 1

Analytica is a data analysis & manipulation library with basic pandas-like functionality. It is built from scratch on numpy to get a sense of how things work behind the scenes.
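A pandas-like column store can be sketched in a few lines of numpy; the class and method names below are illustrative, not Analytica's actual API:

```python
import numpy as np

class Frame:
    """Tiny pandas-like table backed by numpy arrays (one per column)."""

    def __init__(self, data):
        self.columns = {name: np.asarray(vals) for name, vals in data.items()}

    def __getitem__(self, name):
        return self.columns[name]

    def filter(self, mask):
        # Boolean-mask every column, like df[df['a'] > 1] in pandas
        return Frame({n: c[mask] for n, c in self.columns.items()})

    def mean(self, name):
        return float(self.columns[name].mean())

df = Frame({"a": [1, 2, 3], "b": [10.0, 20.0, 30.0]})
big = df.filter(df["a"] > 1)
print(big.mean("b"))  # -> 25.0
```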

najafmurtaza/Developing-Machine-Learning-Models-in-Flask 0

A Flask app for training/testing Watson, FastText, and GenSen embeddings and HDBSCAN. It also supports MLflow for model-info logging.
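Serving a model behind Flask typically boils down to one JSON endpoint; a minimal sketch with a stand-in model and hypothetical route names (the real repo would load Watson/FastText/GenSen artifacts instead):

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

def predict(text):
    # Stand-in "model": real code would run an embedding model here.
    return {"length": len(text)}

@app.route("/predict", methods=["POST"])
def predict_route():
    payload = request.get_json(force=True)
    return jsonify(predict(payload.get("text", "")))
```

Running `app.run()` exposes the endpoint; MLflow logging would typically wrap the `predict` call.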

najafmurtaza/gensen 0

Learning General Purpose Distributed Sentence Representations via Large Scale Multi-task Learning

najafmurtaza/SentEval 0

A python tool for evaluating the quality of sentence embeddings.

started JohnGiorgi/DeCLUTR

started time in 2 months

PR opened Maluuba/gensen

fixed Encoder and GensenSingle for CPU
  1. GensenSingle was loading the model directly onto the GPU instead of checking with self.cuda().
  2. Encoder had a bug in its try/except that moved all tensors to the GPU by default. A cuda attribute was added, and data is now moved based on self.cuda() checks.
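The pattern described in the fix, gating every device move on a cuda flag instead of trying the GPU first and catching the failure, can be sketched framework-free; `Encoder`, `cuda`, and `to_device` here are illustrative stand-ins, not the actual gensen code:

```python
class Encoder:
    """Sketch of the fix: gate device moves on a `cuda` attribute."""

    def __init__(self, cuda=False):
        self.cuda = cuda  # set True only when a GPU is actually available

    def to_device(self, tensor):
        # Before the fix: tensors went to the GPU unconditionally inside
        # a try/except. After: move only when self.cuda is set.
        if self.cuda:
            return ("gpu", tensor)
        return ("cpu", tensor)

enc = Encoder(cuda=False)
print(enc.to_device([1, 2, 3]))  # -> ('cpu', [1, 2, 3])
```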
+21 -12

0 comments

1 changed file

pr created time in 2 months

push event najafmurtaza/gensen

Najaf Murtaza

commit sha f1c5f3f6aa1583e02b148f908d9cf9edf4142ba6

fixed Encoder and GensenSingle for CPU. 1. GensenSingle was loading the model directly onto the GPU instead of checking with self.cuda(). 2. Encoder had a bug in its try/except that moved all tensors to the GPU by default; a cuda attribute was added, and data is now moved based on self.cuda() checks.

view details

push time in 2 months

fork najafmurtaza/gensen

Learning General Purpose Distributed Sentence Representations via Large Scale Multi-task Learning

fork in 2 months

push event najafmurtaza/gensen

Najaf Murtaza

commit sha 4d2731ce4f4ffe8814817de27c12624357c2774c

fixed Encoder and GensenSingle for CPU. 1. GensenSingle was loading the model directly onto the GPU instead of checking with self.cuda(). 2. Encoder had a bug in its try/except that moved all tensors to the GPU by default; a cuda attribute was added, and data is now moved based on self.cuda() checks.

view details

push time in 2 months


push event najafmurtaza/gensen

Najaf Murtaza

commit sha cb3a99a285b6b457859a834c25579a776239626e

fixed Encoder and GensenSingle for CPU. 1. GensenSingle was loading the model directly onto the GPU instead of checking with self.cuda(). 2. Encoder had a bug in its try/except that moved all tensors to the GPU by default; a cuda attribute was added, and data is now moved based on self.cuda() checks.

view details

push time in 2 months


push event najafmurtaza/Developing-Machine-Learning-Models-in-Flask

Najaf Murtaza

commit sha ac097a82f81a7b0b2d8228b1429bdd4d20f2daca

Initial commit

view details

push time in 2 months

create branch najafmurtaza/Developing-Machine-Learning-Models-in-Flask

branch : master

created branch time in 2 months

created repository najafmurtaza/Developing-Machine-Learning-Models-in-Flask

created time in 2 months

create branch najafmurtaza/Models_in_Flask

branch : master

created branch time in 2 months

created repository najafmurtaza/Models_in_Flask

created time in 2 months

push event najafmurtaza/General_Sentence_Embeddings

Najaf Murtaza

commit sha 5427bccd3f9033c19d27e665ad93024a4f6f5b9d

Update README.md

view details

push time in 2 months

push event najafmurtaza/General_Sentence_Embeddings

Najaf Murtaza

commit sha 4036e0b020fa1bc54a7e74509f1861c58106182f

fixed max_length explanation comment

view details

push time in 2 months

push event najafmurtaza/General_Sentence_Embeddings

Najaf Murtaza

commit sha cdb5d41917b05733af259db60ed557fb481e1302

Update README.md

view details

push time in 2 months

push event najafmurtaza/General_Sentence_Embeddings

Najaf Murtaza

commit sha 6f1598fba3b8a18a97da1ca3f35db67969f3a3cf

fixed max_length explanation comment

view details

push time in 2 months

push event najafmurtaza/General_Sentence_Embeddings

Najaf Murtaza

commit sha b098d91d36010a28eee4cf55029fc24de02d6873

Initial commit

view details

push time in 2 months

push event najafmurtaza/General_Sentence_Embeddings

Najaf Murtaza

commit sha 1dc4853271b2c25adb10e7be191b80b38c0e6192

Create README.md

view details

push time in 2 months

create branch najafmurtaza/Sentence_Embeddings

branch : master

created branch time in 3 months

created repository najafmurtaza/Sentence_Embeddings

Extract Sentence Embeddings from Hugging Face pre-trained models

created time in 3 months

PR opened facebookresearch/SentEval

Fixed examples/gensen.py

In examples/gensen.py, the batcher func referenced undefined gensen and sentences variables. They are now replaced with the appropriate variable names.
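SentEval's contract is that batcher(params, batch) maps a list of tokenized sentences to a 2-D array of embeddings, one row per sentence. A self-contained sketch with a stub encoder (the real example would call the GenSen model held in params instead):

```python
import numpy as np

def stub_encode(sentences):
    """Stand-in for a real encoder: one vector per sentence (length-based here)."""
    return np.array(
        [[len(s), sum(len(w) for w in s)] for s in sentences],
        dtype=np.float32,
    )

def batcher(params, batch):
    # SentEval passes each sentence as a list of tokens; empty sentences
    # are conventionally replaced by a '.' token.
    sentences = [s if s else ['.'] for s in batch]
    return stub_encode(sentences)

embs = batcher({}, [["a", "bb"], []])
print(embs.shape)  # -> (2, 2)
```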

+2 -5

0 comments

1 changed file

pr created time in 3 months

push event najafmurtaza/SentEval

Najaf Murtaza

commit sha 58b9f364db5176345539591f871b0e0feecdc53a

Fixed examples/gensen.py. In ```examples/gensen.py```, the ```batcher``` func referenced undefined ```gensen``` and ```sentences``` variables; they are now replaced with the appropriate variable names.

view details

push time in 3 months

fork najafmurtaza/SentEval

A python tool for evaluating the quality of sentence embeddings.

fork in 3 months

started Novetta/adaptnlp

started time in 3 months

issue closed UKPLab/sentence-transformers

Confused about these 3 usages

Hi @nreimers,

What is the difference between Sentence Embeddings with Transformers, Loading custom BERT models, and Models?

From my understanding, Loading custom BERT models is for models that are fine-tuned/trained using the Sentence-Transformers lib, Sentence Embeddings with Transformers is for using Hugging Face models directly instead of sentence_transformer, and Models is the low-level version of Loading custom BERT models.

But if we can use Sentence Embeddings with Transformers, then what's the usage/advantage of the other two?

closed time in 3 months

najafmurtaza

issue comment UKPLab/sentence-transformers

Confused about these 3 usages

In conclusion: in order to use the Sentence-Transformers lib, either use the pre-trained models available in the repo or train/fine-tune a new one. Right?

najafmurtaza

comment created time in 3 months

issue comment UKPLab/sentence-transformers

Confused about these 3 usages

What I am trying to achieve is to use a Hugging Face AlexNet/T5 model, but with sentence-transformers:

model = SentenceTransformer('T5/AlexNet')
embds = model.encode(sentences)

Is that possible?

najafmurtaza

comment created time in 3 months

issue opened UKPLab/sentence-transformers

Confused about these 3 usages

Hi @nreimers,

What is the difference between Sentence Embeddings with Transformers, Loading custom BERT models, and Models?

From my understanding, Loading custom BERT models is for models that are fine-tuned/trained using the Sentence-Transformers lib, Sentence Embeddings with Transformers is for using Hugging Face models directly instead of sentence_transformer, and Models is the low-level version of Loading custom BERT models.

But if we can use Sentence Embeddings with Transformers, then what's the usage/advantage of the other two?

created time in 3 months

started Maluuba/gensen

started time in 3 months

started UKPLab/sentence-transformers

started time in 3 months

started craffel/dl3d-seminar

started time in 3 months

issue closed google-research/google-research

RuntimeError: Error initializing searcher:

Code:

```
def f(x):
    return np.random.rand(100)

train_data['emb'] = train_data['text'].apply(f)
test_data['emb'] = test_data['text'].apply(f)

dataset = np.array(train_data['emb'].tolist(), dtype=np.float32)
queries = np.array(test_data['emb'].tolist(), dtype=np.float32)

normalized_dataset = dataset / np.linalg.norm(dataset, axis=1)[:, np.newaxis]
# configure ScaNN as a tree - asymmetric hash hybrid with reordering
# anisotropic quantization as described in the paper; see README

# ----- At this line -----
searcher = scann.ScannBuilder(normalized_dataset, 10, "dot_product").tree(
    num_leaves=2000, num_leaves_to_search=100, training_sample_size=250000).score_ah(
    2, anisotropic_quantization_threshold=0.2).reorder(100).create_pybind()
```

Error:

```
Traceback (most recent call last):
  File "/home/dev/.local/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 3331, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-7-c748bb36181c>", line 6, in <module>
    2, anisotropic_quantization_threshold=0.2).reorder(100).create_pybind()
  File "/home/dev/.local/lib/python3.6/site-packages/scann/scann_ops/py/scann_builder.py", line 203, in create_pybind
    self.training_threads)
  File "/home/dev/.local/lib/python3.6/site-packages/scann/scann_ops/py/scann_ops_pybind.py", line 60, in create_searcher
    scann_pybind.ScannNumpy(db, scann_config, training_threads))
RuntimeError: Error initializing searcher:
```

closed time in 3 months

najafmurtaza

issue comment google-research/google-research

RuntimeError: Error initializing searcher:

The issue is resolved. It turns out the searcher expects a large number of training (dataset) examples; in my case the dataset var had only 50.
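A plausible reading of the failure: the snippet asked for num_leaves=2000 and training_sample_size=250000 against a 50-row dataset, far more structure than the data can support. A hypothetical pre-flight check (not part of the ScaNN API) makes the mismatch visible without touching ScaNN itself:

```python
import numpy as np

def check_scann_config(dataset, num_leaves, training_sample_size):
    """Hypothetical sanity check: flag configs that ask for more
    partitions or training samples than the dataset has points."""
    n = dataset.shape[0]
    problems = []
    if n < num_leaves:
        problems.append(f"num_leaves={num_leaves} exceeds {n} dataset points")
    if n < training_sample_size:
        problems.append(f"training_sample_size={training_sample_size} exceeds {n} points")
    return problems

tiny = np.random.rand(50, 100).astype(np.float32)
print(check_scann_config(tiny, num_leaves=2000, training_sample_size=250000))
```

For the 50-row case this reports both problems; either add data or shrink the tree/training parameters.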

najafmurtaza

comment created time in 3 months

issue opened google-research/google-research

RuntimeError: Error initializing searcher:

Code:

```
def f(x):
    return np.random.rand(100)

train_data['emb'] = train_data['text'].apply(f)
test_data['emb'] = test_data['text'].apply(f)
```

```
Traceback (most recent call last):
  File "/home/dev/.local/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 3331, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-7-c748bb36181c>", line 6, in <module>
    2, anisotropic_quantization_threshold=0.2).reorder(100).create_pybind()
  File "/home/dev/.local/lib/python3.6/site-packages/scann/scann_ops/py/scann_builder.py", line 203, in create_pybind
    self.training_threads)
  File "/home/dev/.local/lib/python3.6/site-packages/scann/scann_ops/py/scann_ops_pybind.py", line 60, in create_searcher
    scann_pybind.ScannNumpy(db, scann_config, training_threads))
RuntimeError: Error initializing searcher:
```

created time in 3 months
