Favio André Vázquez (FavioVazquez) · Life · México · https://www.linkedin.com/in/faviovazquez
Physicist and computational engineer. I have a passion for science, philosophy, programming, and Lacanian psychoanalysis. Working on cosmology and big data.

push event ironmussa/Optimus

pyup-bot

commit sha 8321c7e7398fc92219001da3e005f0cd96998a2a

Update pymssql from 2.1.4 to 3.0.3

view details

push time in 21 hours
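pyup-bot commits like the one above typically bump a single pinned requirement. Assuming Optimus pins its dependencies in a requirements file (the exact filename is not shown in this feed), the commit would amount to a one-line change such as:

```diff
-pymssql==2.1.4
+pymssql==3.0.3
```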

create branch ironmussa/Optimus

branch : pyup-update-pymssql-2.1.4-to-3.0.3

created branch time in 21 hours

delete branch ironmussa/Optimus

delete branch : pyup-update-pytest-4.6.2-to-5.2.3

delete time in 21 hours

push event ironmussa/Optimus

pyup-bot

commit sha 4fd5c5e5155d1c91bde8d74d5fa7b902ab1af1ea

Update pytest from 4.6.2 to 5.2.4

view details

push time in 21 hours

push event ironmussa/Optimus

pyup-bot

commit sha c723ce40deb2786de5a92f354186ba0d872c1d93

Update pytest from 4.6.2 to 5.2.4

view details

push time in 21 hours

create branch ironmussa/Optimus

branch : pyup-update-pytest-4.6.2-to-5.2.4

created branch time in 21 hours

push event ironmussa/Optimus

pyup-bot

commit sha 72716d45e8964446a1745635ac2559521cd2809b

Update pytest from 4.6.2 to 5.2.3

view details

push time in 2 days

create branch ironmussa/Optimus

branch : pyup-update-pytest-4.6.2-to-5.2.3

created branch time in 2 days

push event ironmussa/Optimus

pyup-bot

commit sha 2c51c5f29b751d1f4c5cf67b53b0351f79a21f37

Update pytest from 4.6.2 to 5.2.3

view details

push time in 2 days

pull request comment ironmussa/Optimus

Improve string_to_index() and sort() in columns

Codacy Here is an overview of what got changed by this pull request:


Issues
======
+ Solved 1
           

See the complete overview on Codacy

argenisleon

comment created time in 2 days

delete branch ironmussa/Optimus

delete branch : pyup-update-pylint-2.3.1-to-2.4.3

delete time in 2 days

push event ironmussa/Optimus

pyup-bot

commit sha 62b1a125a029d936708f52596f2e957dcfc0a6aa

Update pylint from 2.3.1 to 2.4.4

view details

push time in 2 days

create branch ironmussa/Optimus

branch : pyup-update-pylint-2.3.1-to-2.4.4

created branch time in 2 days

Pull request review comment ironmussa/Optimus

Fix profiler frequency info

 def to_file(self, path=None, output="html"):

     def to_json(self, df, columns="*", buckets=10, infer=False, relative_error=RELATIVE_ERROR, approx_count=True,
                 sample=10000, stats=True, mismatch=None):
-        columns, output = self.dataset(df, columns=columns, buckets=buckets, infer=infer, relative_error=relative_error,
-                                       approx_count=approx_count,
-                                       sample=sample, stats=stats, format="json", mismatch=mismatch)
-        return output
+        return self.dataset(df, columns=columns, buckets=buckets, infer=infer, relative_error=relative_error,
+                            approx_count=approx_count,
+                            sample=sample, stats=stats, format="json", mismatch=mismatch)
+
+    def cols_needs_profiling(self, df, columns):
+        """
+        Calculate the columns that needs to be profiled.
+        :return:
+        """
+        # Metadata
+        # If not empty the profiler already run.
+        # So process the dataframe's metadata to be sure which columns need to be profiled
+
+        actions = df.get_meta("transformations.actions")
+        are_actions = actions is not None and len(actions) > 0
+
+        # Process actions to check if any column must be processed
+        if self.is_cached():
+            if are_actions:
+
+                drop = ["drop"]
+
+                def match_actions_names(_actions):
+                    """
+                    Get a list of columns which have been applied and specific action.
+                    :param _actions:
+                    :return:
+                    """
+
+                    _actions_json = df.get_meta("transformations.actions")
+
+                    modified = []
+                    for action in _actions:
+                        if _actions_json.get(action):
+                            # Check if was renamed
+                            col = _actions_json.get(action)
+                            if len(match_renames(col)) == 0:
+                                _result = col
+                            else:
+                                _result = match_renames(col)
+                            modified = modified + _result
+
+                    return modified
+
+                def match_renames(_col_names):
+                    """
+                    Get a list fo columns and return the renamed version.
+                    :param _col_names:
+                    :return:
+                    """
+                    _renamed_columns = []
+                    _actions = df.get_meta("transformations.actions")
+                    _rename = _actions.get("rename")
+
+                    def get_name(_col_name):
+                        c = _rename.get(_col_name)
+                        # The column has not been rename. Get the actual column name
+                        if c is None:
+                            c = _col_name
+                        return c
+
+                    if _rename:
+                        # if a list
+                        if is_list_of_str(_col_names):
+                            for _col_name in _col_names:
+                                # The column name has been changed. Get the new name
+                                _renamed_columns.append(get_name(_col_name))
+                        # if a dict
+                        if is_dict(_col_names):
+                            for _col1, _col2 in _col_names.items():
+                                _renamed_columns.append({get_name(_col1): get_name(_col2)})
+
+                    else:
+                        _renamed_columns = _col_names
+                    return _renamed_columns
+
+                # New columns
+                new_columns = []
+
+                current_col_names = df.cols.names()
+                renamed_cols = match_renames(df.get_meta("transformations.columns"))
+                for current_col_name in current_col_names:
+                    if current_col_name not in renamed_cols:
+                        new_columns.append(current_col_name)
+
+                # Rename keys to match new names
+                profiler_columns = self.output_columns["columns"]
+                actions = df.get_meta("transformations.actions")
+                rename = actions.get("rename")
+                if rename:
+                    for k, v in actions["rename"].items():
+                        profiler_columns[v] = profiler_columns.pop(k)
+                        profiler_columns[v]["name"] = v
+
+                # Drop Keys
+                for col_names in match_actions_names(drop):
+                    profiler_columns.pop(col_names)
+
+                # Copy Keys
+                copy_columns = df.get_meta("transformations.actions.copy")
+                if copy_columns is not None:
+                    for source, target in copy_columns.items():
+                        profiler_columns[target] = profiler_columns[source].copy()
+                        profiler_columns[target]["name"] = target
+                    # Check is a new column is a copied column
+                    new_columns = list(set(new_columns) - set(copy_columns.values()))
+
+                # Actions applied to current columns
+
+                modified_columns = match_actions_names(Actions.list())
+                calculate_columns = modified_columns + new_columns
+
+                # Remove duplicated.
+                calculate_columns = list(set(calculate_columns))
+
+            elif not are_actions:
+                calculate_columns = None
+            # elif not is_cached:
+        else:
+            calculate_columns = columns
+
+        return calculate_columns
+
+    def is_cached(self):
+        """
+
+        :return:
+        """
+        return len(self.output_columns) > 0

     def dataset(self, df, columns="*", buckets=10, infer=False, relative_error=RELATIVE_ERROR, approx_count=True,
-                sample=10000, stats=True, format="dict", mismatch=None):
+                sample=10000, stats=True, format="dict", mismatch=None, advanced_stats=False):
argenisleon

comment created time in 2 days
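The core bookkeeping in the reviewed hunk is translating old column names to their current names via the "rename" action map, so cached profiler results can be reused. A minimal standalone sketch of that idea (hypothetical, heavily simplified: the real `match_renames` reads the map from dataframe metadata with `df.get_meta` and also handles dict inputs):

```python
def match_renames(col_names, rename):
    """Return col_names with any renamed columns replaced by their new names.

    rename maps old column name -> new column name; columns absent from the
    map are assumed unchanged and pass through untouched.
    """
    if not rename:
        return list(col_names)
    return [rename.get(name, name) for name in col_names]

# Columns "a" and "b" were renamed by earlier transformations; "c" was not.
rename_actions = {"a": "a_new", "b": "b_new"}
print(match_renames(["a", "b", "c"], rename_actions))  # ['a_new', 'b_new', 'c']
```

With the names resolved this way, the hunk can then compare the dataframe's current columns against the cached profiler keys to decide which columns actually need re-profiling.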

pull request comment ironmussa/Optimus

Fix profiler frequency info

Codacy Here is an overview of what got changed by this pull request:


Issues
======
+ Solved 3
- Added 1
           

Complexity increasing per file
==============================
- optimus/ml/distancecluster.py  2
         

Complexity decreasing per file
==============================
+ optimus/profiler/profiler.py  -7
         

See the complete overview on Codacy

argenisleon

comment created time in 2 days

Pull request review comment ironmussa/Optimus

Fix profiler frequency info

(This review comment quotes the same `to_json`/`cols_needs_profiling` hunk of the profiler as the earlier review comment.)
argenisleon

comment created time in 3 days

Pull request review comment ironmussa/Optimus

Fix profiler frequency info

 import collections
+import collections
argenisleon

comment created time in 3 days

pull request comment ironmussa/Optimus

Fix profiler frequency info

Codacy Here is an overview of what got changed by this pull request:


Issues
======
+ Solved 3
- Added 2
           

Complexity increasing per file
==============================
- optimus/ml/distancecluster.py  2
         

Complexity decreasing per file
==============================
+ optimus/profiler/profiler.py  -7
         

See the complete overview on Codacy

argenisleon

comment created time in 3 days

Pull request review comment ironmussa/Optimus

Fix profiler frequency info

(This review comment quotes the same `to_json`/`cols_needs_profiling` hunk of the profiler as the earlier review comment.)
argenisleon

comment created time in 3 days

Pull request review comment ironmussa/Optimus

Fix profiler frequency info

(Quotes the same duplicated `import collections` hunk as the earlier review comment.)
argenisleon

comment created time in 3 days

pull request comment ironmussa/Optimus

Fix profiler frequency info

Codacy Here is an overview of what got changed by this pull request:


Issues
======
+ Solved 3
- Added 2
           

Complexity increasing per file
==============================
- optimus/ml/distancecluster.py  2
         

Complexity decreasing per file
==============================
+ optimus/profiler/profiler.py  -7
         

See the complete overview on Codacy

argenisleon

comment created time in 3 days

Pull request review comment ironmussa/Optimus

Fix profiler frequency info

(Quotes the same duplicated `import collections` hunk as the earlier review comment.)
argenisleon

comment created time in 3 days

Pull request review comment ironmussa/Optimus

Fix profiler frequency info

(This review comment quotes the same `to_json`/`cols_needs_profiling` hunk of the profiler as the earlier review comment.)
argenisleon

comment created time in 3 days

pull request comment ironmussa/Optimus

Fix profiler frequency info

Codacy Here is an overview of what got changed by this pull request:


Issues
======
+ Solved 3
- Added 2
           

Complexity increasing per file
==============================
- optimus/ml/distancecluster.py  2
         

Complexity decreasing per file
==============================
+ optimus/profiler/profiler.py  -7
         

See the complete overview on Codacy

argenisleon

comment created time in 3 days

PublicEvent

Pull request review comment ironmussa/Optimus

Fix profiler frequency info

 MAX_BUCKETS = 33
 BATCH_SIZE = 20

+import collections
+import six
+
+# python 3.8+ compatibility
+try:
+    collectionsAbc = collections.abc
+except:
argenisleon

comment created time in 3 days
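The hunk quoted in that review adds a compatibility shim for the `collections` ABCs. A runnable sketch of the same idea, assuming the intent is simply to bind one name that works across Python versions (`AttributeError` is caught here rather than the bare `except:` shown in the diff):

```python
import collections

# collections.Mapping and the other ABCs moved to collections.abc in
# Python 3.3, and the old top-level aliases were removed in Python 3.10;
# the diff targets the deprecation warnings raised on 3.8+. Binding one
# name up front avoids version checks at every call site.
try:
    collectionsAbc = collections.abc  # any Python 3
except AttributeError:
    collectionsAbc = collections      # legacy fallback for very old Pythons

print(isinstance({}, collectionsAbc.Mapping))  # True
```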

Pull request review comment ironmussa/Optimus

Fix profiler frequency info

(This review comment quotes the same `to_json`/`cols_needs_profiling` hunk of the profiler as the earlier review comment; the feed is truncated here.)
         modified_columns = match_actions_names(Actions.list())+                calculate_columns = modified_columns + new_columns++                # Remove duplicated.+                calculate_columns = list(set(calculate_columns))++            elif not are_actions:+                calculate_columns = None+            # elif not is_cached:+        else:+            calculate_columns = columns++        return calculate_columns++    def is_cached(self):+        """++        :return:+        """+        return len(self.output_columns) > 0      def dataset(self, df, columns="*", buckets=10, infer=False, relative_error=RELATIVE_ERROR, approx_count=True,-                sample=10000, stats=True, format="dict", mismatch=None):+                sample=10000, stats=True, format="dict", mismatch=None, advanced_stats=False):
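The diff above tracks renames, drops, and copies through the dataframe's metadata to decide which columns still need profiling: anything touched by a recorded action (mapped through any rename) plus any column that did not exist when the profile was cached. A minimal standalone sketch of that idea follows; the function name, the flattened metadata layout, and the example columns are assumptions for illustration, not the Optimus API.

```python
# Standalone sketch of the column-tracking idea in cols_needs_profiling():
# decide which columns must be (re)profiled from a simplified actions dict.
# This is NOT the Optimus implementation -- names and layout are assumed.

def columns_to_profile(current_cols, cached_cols, actions):
    """Return the sorted list of columns that need (re)profiling."""
    rename = actions.get("rename", {})  # {old_name: new_name}

    # Columns touched by any recorded action, mapped through renames
    modified = {rename.get(c, c)
                for op, cols in actions.items()
                if op != "rename"
                for c in cols}

    # Columns that did not exist when the profile was cached
    renamed_cache = {rename.get(c, c) for c in cached_cols}
    new = {c for c in current_cols if c not in renamed_cache}

    return sorted(modified | new)


cols = columns_to_profile(
    current_cols=["id", "last_name", "age"],
    cached_cols=["id", "surname"],
    actions={"rename": {"surname": "last_name"},
             "upper": ["last_name"]},
)
# cols == ["age", "last_name"]: "age" is new and "last_name" was
# modified (upper applied after the rename); "id" keeps its cached profile.
```

The payoff is the same as in the real method: untouched columns reuse the cached profile, so the profiler only recomputes the expensive statistics for the columns that actually changed.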
argenisleon

comment created time in 3 days

pull request comment ironmussa/Optimus

Fix profiler frequency info

Codacy Here is an overview of what got changed by this pull request:


Issues
======
+ Solved 3
- Added 2
           

Complexity decreasing per file
==============================
+ optimus/profiler/profiler.py  -7
         

See the complete overview on Codacy

argenisleon

comment created time in 3 days

delete branch ironmussa/Optimus

delete branch : pyup-update-deprecated-1.2.5-to-1.2.6

delete time in 4 days

push event ironmussa/Optimus

pyup-bot

commit sha 3b4747c2dd36cddfaf268ece016aa8f50e8ff85a

Update deprecated from 1.2.5 to 1.2.7

view details

push time in 4 days

push event ironmussa/Optimus

pyup-bot

commit sha 0cc28dce1ced2045f49c2f6239fa7b967ec2b371

Update deprecated from 1.2.5 to 1.2.7

view details

push time in 4 days

push event ironmussa/Optimus

pyup-bot

commit sha ae22a5b1f9434eeffeea2687751584aed85b758e

Update deprecated from 1.2.5 to 1.2.7

view details

push time in 4 days

push event ironmussa/Optimus

pyup-bot

commit sha 588e5babd19f0989e13be41632c1e063a72bea11

Update deprecated from 1.2.5 to 1.2.7

view details

push time in 4 days

create branch ironmussa/Optimus

branch : pyup-update-deprecated-1.2.5-to-1.2.7

created branch time in 4 days

push event ironmussa/Optimus

pyup-bot

commit sha 7e07063e61c1e9043e75c1c6a5ee54102b7f34c3

Update numpy from 1.17.2 to 1.17.4

view details

push time in 5 days

push event ironmussa/Optimus

pyup-bot

commit sha f7ea2593f7824c364701c96f60a27bc304e0a787

Update numpy from 1.17.2 to 1.17.4

view details

push time in 5 days

push event ironmussa/Optimus

pyup-bot

commit sha 8cb8ebed95819d0b71e9a743df5925b8427d8fe1

Update numpy from 1.17.2 to 1.17.4

view details

push time in 5 days

push event ironmussa/Optimus

pyup-bot

commit sha 84916fab47754a9c94787f056a43edaae5be13ed

Update numpy from 1.11.1 to 1.17.4

view details

push time in 5 days

create branch ironmussa/Optimus

branch : pyup-update-numpy-1.11.1-to-1.17.4

created branch time in 5 days

delete branch ironmussa/Optimus

delete branch : pyup-update-kombu-4.6.1-to-4.6.5

delete time in 6 days

push event ironmussa/Optimus

pyup-bot

commit sha 40b1d80e308a4e35895d32f8930e6ee7d383a5ce

Update kombu from 4.6.1 to 4.6.6

view details

push time in 6 days

push event ironmussa/Optimus

pyup-bot

commit sha d6c1236ae14bb76da58ae236438ed04d765e606f

Update kombu from 4.6.1 to 4.6.6

view details

push time in 6 days

push event ironmussa/Optimus

pyup-bot

commit sha 55efad5482f09d1fe7b720544a4ee711e5061652

Update kombu from 4.6.1 to 4.6.6

view details

push time in 6 days

push event ironmussa/Optimus

pyup-bot

commit sha 72b69796c9f6d53cbf5d48748bdbef85d02abf65

Update kombu from 4.6.1 to 4.6.6

view details

push time in 6 days

create branch ironmussa/Optimus

branch : pyup-update-kombu-4.6.1-to-4.6.6

created branch time in 6 days

delete branch ironmussa/Optimus

delete branch : pyup-update-tqdm-4.28.1-to-4.37.0

delete time in 7 days

push event ironmussa/Optimus

pyup-bot

commit sha 27aab8f57cbb3d3e506113bd554d783fde17e7cb

Update tqdm from 4.28.1 to 4.38.0

view details

push time in 7 days

push event ironmussa/Optimus

pyup-bot

commit sha 3022e7482d13d8fa911adbceb591a505edfa61e9

Update tqdm from 4.28.1 to 4.38.0

view details

push time in 7 days

push event ironmussa/Optimus

pyup-bot

commit sha 6f7989829701f7dfc2a57d8d9302e78a93240c7e

Update tqdm from 4.28.1 to 4.38.0

view details

push time in 7 days

push event ironmussa/Optimus

pyup-bot

commit sha fddc70a62cc61765b7d94efef285ca34a79153c3

Update tqdm from 4.28.1 to 4.38.0

view details

push time in 7 days

create branch ironmussa/Optimus

branch : pyup-update-tqdm-4.28.1-to-4.38.0

created branch time in 7 days

delete branch ironmussa/Optimus

delete branch : pyup-update-deepdiff-4.0.6-to-4.0.8

delete time in 7 days

push event ironmussa/Optimus

pyup-bot

commit sha 1839c224b955ed727a6fbea75805a102319a402b

Update deepdiff from 4.0.6 to 4.0.9

view details

push time in 7 days

push event ironmussa/Optimus

pyup-bot

commit sha 56907023696f13e595b0882a1018f67d9e796064

Update deepdiff from 4.0.6 to 4.0.9

view details

push time in 7 days

push event ironmussa/Optimus

pyup-bot

commit sha b9de9ce80a5fa388cf74c2ee3e7359252d95c809

Update deepdiff from 4.0.6 to 4.0.9

view details

push time in 7 days

push event ironmussa/Optimus

pyup-bot

commit sha c8b8fa397716a73cd32e170435cd241762dbd8a4

Update deepdiff from 4.0.6 to 4.0.9

view details

push time in 7 days

create branch ironmussa/Optimus

branch : pyup-update-deepdiff-4.0.6-to-4.0.9

created branch time in 7 days

pull request comment ironmussa/Optimus

Update h2o-pysparkling-2.4 to 2.4.13

It passed the ML tests, right @argenisleon?

atwoodjw

comment created time in 8 days

started svjan5/GNNs-for-NLP

started time in 10 days

started deezer/spleeter

started time in 10 days

push event ironmussa/Optimus

pyup-bot

commit sha 149672b0564077746bc2191cc152362cf5b36b7a

Update psutil from 5.6.3 to 5.6.5

view details

push time in 10 days

push event ironmussa/Optimus

pyup-bot

commit sha b27f640ff4da67292b9435b8352de8cf4a298e04

Update psutil from 5.6.3 to 5.6.5

view details

push time in 10 days

create branch ironmussa/Optimus

branch : pyup-update-psutil-5.6.3-to-5.6.5

created branch time in 10 days

push event ironmussa/Optimus

pyup-bot

commit sha bad9ffe43b0edcc6331da6f7f036cc549754ad66

Update psutil from 5.4.8 to 5.6.5

view details

push time in 10 days

push event ironmussa/Optimus

pyup-bot

commit sha 57abd60257ceb5acf98c1c8f7d2f947278475aa6

Update psutil from 5.6.3 to 5.6.5

view details

push time in 10 days

push event ironmussa/Optimus

pyup-bot

commit sha 855bc00ba535fa075b67597a09940bc8ac625e92

Update mysqlclient from 1.4.4 to 1.4.5

view details

push time in 10 days

create branch ironmussa/Optimus

branch : pyup-update-mysqlclient-1.4.4-to-1.4.5

created branch time in 10 days

created tag ironmussa/Optimus

tag 2.2.25

:truck: Agile Data Science Workflows made easy with Pyspark

created time in 12 days

release ironmussa/Optimus

2.2.25

released time in 12 days

push event ironmussa/Optimus

pyup-bot

commit sha a871b945ba8ce06889d4589d13ee3b77f7f5e236

Update python_dateutil from 2.7.5 to 2.8.1

view details

push time in 13 days

create branch ironmussa/Optimus

branch : pyup-update-python_dateutil-2.7.5-to-2.8.1

created branch time in 13 days

push event ironmussa/Optimus

pyup-bot

commit sha 953edf177ad6b384b5cce6e4062cf59da16e9c96

Update psutil from 5.6.3 to 5.6.4

view details

push time in 13 days

push event ironmussa/Optimus

pyup-bot

commit sha 2e9f06ee8f123d4a2c346198daf6f344dc017490

Update psutil from 5.6.3 to 5.6.4

view details

push time in 13 days

push event ironmussa/Optimus

pyup-bot

commit sha eaff818088c480a763c07831414c1ac7a3be734d

Update psutil from 5.4.8 to 5.6.4

view details

push time in 13 days

push event ironmussa/Optimus

pyup-bot

commit sha 8dc0ab63d9973187ef03739faa16b130ecc41eda

Update psutil from 5.6.3 to 5.6.4

view details

push time in 13 days

create branch ironmussa/Optimus

branch : pyup-update-psutil-5.6.3-to-5.6.4

created branch time in 13 days

push event ironmussa/Optimus

pyup-bot

commit sha f93e222faf0c4c8200ff5c67454747c88c90c190

Update pandas from 0.24.2 to 0.25.3

view details

push time in 15 days

push event ironmussa/Optimus

pyup-bot

commit sha 8c68a73281bb2189648afc0522438bbc3d805518

Update pandas from 0.24.2 to 0.25.3

view details

push time in 15 days

push event ironmussa/Optimus

pyup-bot

commit sha 93baaff9b1dd43fb9e371637041efa0695882bfa

Update pandas from 0.19.2 to 0.25.3

view details

push time in 15 days

create branch ironmussa/Optimus

branch : pyup-update-pandas-0.19.2-to-0.25.3

created branch time in 15 days

push event ironmussa/Optimus

pyup-bot

commit sha a8a757a4eb40f9ed1a39679a921a8626c974fe00

Update pyarrow from 0.13.0 to 0.15.1

view details

push time in 15 days

push event ironmussa/Optimus

pyup-bot

commit sha b5ca8748e19a37fd9aa20afaa86a5fce388d14a2

Update pyarrow from 0.13.0 to 0.15.1

view details

push time in 15 days

push event ironmussa/Optimus

pyup-bot

commit sha afbbe8a68155656cfb34fbdc0cc3882b2f1e3b46

Update pyarrow from 0.13.0 to 0.15.1

view details

push time in 15 days

create branch ironmussa/Optimus

branch : pyup-update-pyarrow-0.13.0-to-0.15.1

created branch time in 15 days

delete branch ironmussa/Optimus

delete branch : pyup-update-tqdm-4.28.1-to-4.36.1

delete time in 16 days

push event ironmussa/Optimus

pyup-bot

commit sha 9415a5a40f4b2ebfd96d4045972029a5b485a36d

Update tqdm from 4.28.1 to 4.37.0

view details

push time in 16 days

push event ironmussa/Optimus

pyup-bot

commit sha ec5c970b9270d98a26bb5abd9bf6545c3f565d50

Update tqdm from 4.28.1 to 4.37.0

view details

push time in 16 days

push event ironmussa/Optimus

pyup-bot

commit sha ecff19c9fa4738df745b556fe5928dc306bdff9c

Update tqdm from 4.28.1 to 4.37.0

view details

push time in 16 days

push event ironmussa/Optimus

pyup-bot

commit sha 66d80de93fac8574fbc174795cecda6b468da4d6

Update tqdm from 4.28.1 to 4.37.0

view details

push time in 16 days

create branch ironmussa/Optimus

branch : pyup-update-tqdm-4.28.1-to-4.37.0

created branch time in 16 days

delete branch ironmussa/Optimus

delete branch : pyup-update-pypika-0.32.0-to-0.35.14

delete time in 16 days

push event ironmussa/Optimus

pyup-bot

commit sha dbb3a7e311b0159d516932f8db94e2b5013912b4

Update pypika from 0.32.0 to 0.35.15

view details

push time in 16 days

push event ironmussa/Optimus

pyup-bot

commit sha 7c478f9092dae603b913021d373706f747c17591

Update pypika from 0.32.0 to 0.35.15

view details

push time in 16 days

push event ironmussa/Optimus

pyup-bot

commit sha c0a91583d6755b58222f2ab64e625c74041146a5

Update pypika from 0.32.0 to 0.35.15

view details

push time in 16 days

create branch ironmussa/Optimus

branch : pyup-update-pypika-0.32.0-to-0.35.15

created branch time in 16 days

push event ironmussa/Optimus

pyup-bot

commit sha 5a20a9f8ebe96e33adaae9ebbb675e62fe7bc6cb

Update pypika from 0.32.0 to 0.35.15

view details

push time in 16 days

started facebookresearch/SlowFast

started time in 17 days

pull request comment FavioVazquez/ds-cheatsheets

R cheatsheet for plotting

Hi @abbiesachapman! We need to put it in the folder as a PDF, also as a PNG. And then link it to the Readme. Do you know how to do this? How can I help?

abbiesachapman

comment created time in 17 days

push event FavioVazquez/ds-cheatsheets

Mexson Fernandes

commit sha 1a65e98b6fe6c92d5c4231c68de681a0cf81ee58

(added) links and content in table for navigation

view details

Mexson Fernandes

commit sha 4ed6032b408c88e69ce4684a0141d9d0d48e45fa

fix(link): whitespaces + link corrected

view details

Favio André Vázquez

commit sha 1ef986764cbd9d677fb56e9d6c0d1989bbaab108

Merge pull request #27 from MexsonFernandes/master Added content in table for better navigation. Closes #24

view details

push time in 17 days

PR merged FavioVazquez/ds-cheatsheets

Added content in table for better navigation enhancement

All PDF cheatsheets are added to the table of contents for better navigation, making it easy to locate the required cheatsheet.

+79 -1

2 comments

1 changed file

MexsonFernandes

pr closed time in 17 days
