Predict the Missing Grade

  • aquaShade
    3 years ago+ 1 comment

    Parsing the .json file can be tricky, but Python's built-in json module handles it. Here is one way to use it:

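    import json
    import pandas as pd

    # Test records arrive on stdin: a count line, then one JSON object per line.
    num_lines = int(input())
    inputs = [json.loads(input()) for _ in range(num_lines)]
    test_df = pd.DataFrame(inputs).fillna(0)

    # training.json also starts with a non-record line, so skip it before parsing.
    with open('training.json') as f:
        next(f)
        data = [json.loads(line) for line in f if line.strip()]
    train_df = pd.DataFrame(data).fillna(0)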
    

    Other minor points to note when submitting:

    • test and train may not have identical columns, and therefore need to be aligned
    • the marker expects integers, so any floats need to be rounded and converted
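
    The two points above can be sketched without pandas, as a minimal illustration rather than a full solution. The helper names `align_records` and `to_int_grades` and the sample records are hypothetical; only the subject names come from this thread. With DataFrames, `reindex(columns=...)` with a `fill_value` serves the same purpose.

```python
def align_records(records, columns, fill=0):
    """Return one row per record, with values in the fixed `columns` order."""
    return [[rec.get(col, fill) for col in columns] for rec in records]


def to_int_grades(predictions):
    """Round float predictions to ints (round() sends exact .5 ties to the even neighbour)."""
    return [int(round(p)) for p in predictions]


# Hypothetical records, made up for illustration only.
train = [{"serial": 1, "English": 4, "Physics": 5, "Mathematics": 6}]
test = [{"serial": 2, "English": 3, "Chemistry": 2}]

# One fixed feature order shared by both sets; absent subjects become 0.
columns = ["English", "Physics", "Chemistry"]
print(align_records(train, columns))  # [[4, 5, 0]]
print(align_records(test, columns))   # [[3, 0, 2]]
print(to_int_grades([3.7, 2.2]))      # [4, 2]
```

    Building both matrices from the same explicit column list keeps each feature in the same position for training and prediction, and leaves "serial" and the target out of the features.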
  • sharnam19_nc
    6 years ago+ 0 comments

    Getting the data from the .json file is taking far too much time. Are there any Python modules that can read a list of JSON objects?

  • RyanTPT
    10 months ago+ 0 comments

    Here's my solution. Not that clean but gets the job done:

    # Enter your code here. Read input from STDIN. Print output to STDOUT
    import json
    import sys
    import numpy as np
    from sklearn.linear_model import LinearRegression

    def read_input():
        # Skip the count line on stdin, then parse one JSON record per line.
        X_test = []
        for line in sys.stdin.readlines()[1:]:
            if not line.strip():
                continue
            data = json.loads(line)
            del data["serial"]
            # Note: values() keeps each record's own key order, so columns
            # only line up if all records share the same subjects.
            X_test.append(list(data.values()))
        return np.array(X_test)

    def read_training_data():
        X_train = []
        y_train = []
        # The with block closes the file, so no explicit close() is needed.
        with open("training.json") as f:
            # Drop the first (non-record) line and ignore blank lines.
            for line in f.read().splitlines()[1:]:
                if not line.strip():
                    continue
                data = json.loads(line)
                y_train.append(data.pop("Mathematics"))
                del data["serial"]
                X_train.append(list(data.values()))
        return np.array(X_train), np.array(y_train)

    def train():
        X_train, y_train = read_training_data()
        X_test = read_input()
        lin_reg = LinearRegression()
        lin_reg.fit(X_train, y_train)
        # Round to the nearest integer; a bare uint8 cast truncates
        # and wraps around on negative predictions.
        y_test = np.rint(lin_reg.predict(X_test)).astype(int)
        return y_test

    def show(y):
        print("\n".join(str(v) for v in y))

    if __name__ == "__main__":
        show(train())
    
  • solgas80
    1 year ago+ 0 comments

    My Python solution uses LDA (linear discriminant analysis) to predict the Math score. I was hoping to code it in R, but reading the JSON file in R timed out.

    import json
    import pandas as pd
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis as LDA

    # Test records arrive on stdin: a count line, then one JSON object per line.
    num_lines = int(input())
    inputs = [json.loads(input()) for _ in range(num_lines)]
    test_df = pd.DataFrame(inputs).fillna(0)

    # training.json also starts with a non-record line, so skip it before parsing.
    with open('training.json') as f:
        next(f)
        data = [json.loads(line) for line in f if line.strip()]
    train_df = pd.DataFrame(data).fillna(0)

    features = ['Physics', 'Chemistry', 'PhysicalEducation', 'English', 'Biology',
                'Accountancy', 'BusinessStudies', 'Economics', 'ComputerScience']

    # reindex gives both frames the same columns in the same order, drops
    # 'serial', and zero-fills any subject that is absent from a frame.
    X_train = train_df.reindex(columns=features, fill_value=0)
    Y_train = train_df['Mathematics'].astype(int)
    X_test = test_df.reindex(columns=features, fill_value=0)

    # LDA treats each distinct Mathematics grade as a class label.
    model = LDA()
    model.fit(X_train.to_numpy(), Y_train.to_numpy())

    for grade in model.predict(X_test.to_numpy()):
        print(grade)
    
  • 18083891_tuan
    4 weeks ago+ 0 comments
    import json
    import numpy as np
    from sklearn.linear_model import LinearRegression

    def loadTestingData():
        x = []
        n = int(input())

        for i in range(n):
            item = json.loads(input())
            del item['serial']
            # Relies on every record listing the same subjects in the same order.
            x.append(list(item.values()))
        return np.array(x)

    def loadTrainingData():
        x = []
        y = []
        with open('training.json') as f:
            # Skip the first (non-record) line and train on every record,
            # not only the first few lines of the file.
            lines = f.readlines()[1:]
            for line in lines:
                if not line.strip():
                    continue
                item = json.loads(line)
                y.append(item.pop('Mathematics'))
                del item['serial']

                x.append(list(item.values()))
        return np.array(x), np.array(y)

    def main():
        xTrain, yTrain = loadTrainingData()
        xTest = loadTestingData()

        model = LinearRegression().fit(xTrain, yTrain)
        result = model.predict(xTest)

        for item in result:
            # int() makes sure a plain integer is printed.
            print(int(round(item)))

    if __name__ == "__main__":
        main()
    