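For function approximation (regression), mean squared error (MSE) is the usual choice of loss function. The inputs should be normalized. The hidden layer can use ReLU, but the output layer should preferably be linear: a sigmoid confines predictions to (0, 1), so it is only appropriate when the targets are scaled to that range.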
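
As a rough illustration of the setup described above, here is a minimal NumPy sketch: normalized inputs, one ReLU hidden layer, and an MSE loss trained with plain gradient descent. The `sin` target, the hidden width of 32, the learning rate, and the step count are all assumptions made for the example, not values from the text. The output layer is left linear here, which is also an assumption of this sketch; a sigmoid output would instead require rescaling the targets to (0, 1).

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression target (hypothetical): y = sin(x) on [-3, 3].
x = np.linspace(-3.0, 3.0, 200).reshape(-1, 1)
y = np.sin(x)

# Normalize the inputs to zero mean and unit variance.
x_norm = (x - x.mean()) / x.std()

# One hidden ReLU layer followed by a linear output layer.
hidden = 32  # assumed width, chosen only for illustration
W1 = rng.normal(0.0, 1.0, (1, hidden))
b1 = np.zeros(hidden)
W2 = rng.normal(0.0, np.sqrt(1.0 / hidden), (hidden, 1))
b2 = np.zeros(1)


def forward(inputs):
    """Return hidden pre-activations, hidden activations, and predictions."""
    h_pre = inputs @ W1 + b1
    h = np.maximum(h_pre, 0.0)  # ReLU
    return h_pre, h, h @ W2 + b2  # linear output, no squashing


def mse(pred, target):
    """Mean squared error between predictions and targets."""
    return float(np.mean((pred - target) ** 2))


initial_loss = mse(forward(x_norm)[2], y)

lr = 0.01  # assumed learning rate
for _ in range(2000):
    h_pre, h, pred = forward(x_norm)

    # Gradient of the MSE with respect to the predictions.
    grad_pred = 2.0 * (pred - y) / len(x_norm)

    # Backpropagate through the linear output layer.
    grad_W2 = h.T @ grad_pred
    grad_b2 = grad_pred.sum(axis=0)

    # Backpropagate through the ReLU hidden layer.
    grad_h_pre = (grad_pred @ W2.T) * (h_pre > 0)
    grad_W1 = x_norm.T @ grad_h_pre
    grad_b1 = grad_h_pre.sum(axis=0)

    # Plain full-batch gradient descent step.
    W1 -= lr * grad_W1
    b1 -= lr * grad_b1
    W2 -= lr * grad_W2
    b2 -= lr * grad_b2

final_loss = mse(forward(x_norm)[2], y)
print(f"MSE before training: {initial_loss:.4f}, after: {final_loss:.4f}")
```

In practice a framework optimizer would replace the hand-written update loop; the point of the sketch is only to show where normalization, ReLU, the output activation, and the MSE loss each enter.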