Predict the solubility (logS) from the dataset:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score, mean_squared_error
# Load the dataset
dp = pd.read_csv('https://raw.githubusercontent.com/dataprofessor/data/master/delaney_solubility_with_descriptors.csv')
# Separate features (x) and target variable (y)
y = dp['logS']
x = dp.drop('logS', axis=1)
# Split the data
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=100)
# Linear Regression Model
lr = LinearRegression()
lr.fit(x_train, y_train)
y_train_pred_lr = lr.predict(x_train)
y_test_pred_lr = lr.predict(x_test)
# Random Forest Regressor Model
k1 = RandomForestRegressor(max_depth=2, random_state=100)
k1.fit(x_train, y_train)
y_train_pred_rf = k1.predict(x_train)
y_test_pred_rf = k1.predict(x_test)
# Evaluate Linear Regression Model
y_train_mse_lr = mean_squared_error(y_train, y_train_pred_lr)
y_train_r2_lr = r2_score(y_train, y_train_pred_lr)
y_test_mse_lr = mean_squared_error(y_test, y_test_pred_lr)
y_test_r2_lr = r2_score(y_test, y_test_pred_lr)
# Evaluate Random Forest Regressor Model
y_train_mse_rf = mean_squared_error(y_train, y_train_pred_rf)
y_train_r2_rf = r2_score(y_train, y_train_pred_rf)
y_test_mse_rf = mean_squared_error(y_test, y_test_pred_rf)
y_test_r2_rf = r2_score(y_test, y_test_pred_rf)
# Create DataFrames
rs_lr = pd.DataFrame({"Method": ["Linear Regression"],
                      "Training MSE": [y_train_mse_lr],
                      "Training R2": [y_train_r2_lr],
                      "Testing MSE": [y_test_mse_lr],
                      "Testing R2": [y_test_r2_lr]})
rs_rf = pd.DataFrame({"Method": ["Random Forest Regressor"],
                      "Training MSE": [y_train_mse_rf],
                      "Training R2": [y_train_r2_rf],
                      "Testing MSE": [y_test_mse_rf],
                      "Testing R2": [y_test_r2_rf]})
# Concatenate DataFrames
finale = pd.concat([rs_lr, rs_rf], ignore_index=True)
print(finale)
The user should be able to supply MolLogP, MolWt, NumRotatableBonds, and AromaticProportion, and the program should then predict the value of logS. I don't know how to modify the program to accept user input.
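
One possible way to do this is sketched below. It assumes the model (`lr` or `k1`) has already been fitted as in the program above, and that the four descriptors are the only feature columns left after dropping `logS`. The helper names `predict_logS` and `prompt_and_predict` and the `FEATURES` list are made up for illustration; they are not part of pandas or scikit-learn.

```python
import pandas as pd

# Assumption: these are the feature columns the model was trained on.
FEATURES = ["MolLogP", "MolWt", "NumRotatableBonds", "AromaticProportion"]


def predict_logS(model, mol_logp, mol_wt, num_rotatable_bonds, aromatic_proportion):
    """Return the predicted logS for one molecule from an already fitted model."""
    values = [mol_logp, mol_wt, num_rotatable_bonds, aromatic_proportion]
    # Build a one-row DataFrame so the model sees named columns, as in training.
    row = pd.DataFrame([dict(zip(FEATURES, values))])
    # If the model recorded its training column order, follow that order.
    order = list(getattr(model, "feature_names_in_", FEATURES))
    return float(model.predict(row[order])[0])


def prompt_and_predict(model, ask=input):
    """Ask for each descriptor in turn, then return the predicted logS."""
    values = [float(ask(f"Enter {name}: ")) for name in FEATURES]
    return predict_logS(model, *values)
```

After the existing `lr.fit(x_train, y_train)` line, calling `print(prompt_and_predict(lr))` would prompt for the four values and print the prediction; passing `k1` instead would use the random forest. Non-numeric input raises `ValueError` from `float`, so a real script would probably want to catch that and ask again.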