# Multiplying a matrix by its transpose gives unexpected output

Hi,

I want to compute the cosine similarities of vectors using Apache Spark. In a simple example, I created a vector from each document using the built-in TF-IDF. Here is the code:

```python
from pyspark.ml.feature import HashingTF, IDF, Normalizer
from pyspark.mllib.linalg.distributed import IndexedRow, IndexedRowMatrix

hashingTF = HashingTF(inputCol="tokenized", outputCol="tf")
tf = hashingTF.transform(df)
idf = IDF(inputCol="tf", outputCol="feature").fit(tf)
tfidf = idf.transform(tf)
normalizer = Normalizer(inputCol="feature", outputCol="norm")
data = normalizer.transform(tfidf)

mat = IndexedRowMatrix(
    data.select("id", "norm")
        .rdd.map(lambda row: IndexedRow(row.id, row.norm.toArray()))
).toBlockMatrix()
dot = mat.multiply(mat.transpose())
```

I expect the output to be a matrix whose diagonal is 1 (because each vector's similarity to itself is one), and it is, as desired:

```
[[1. 0.]
 [0. 1.]]
```

The problem appears when I want to weight the words in the vector space with something other than the typical TF-IDF. So I build the vector space myself and create one vector per document, in which the indices of the document's words carry the new weights and every other index is zero. For example, the following vector is for document id 0:

```
(0, [9.0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3.3010299956639813, 3.3010299956639813, 0, 3.3010299956639813, 0, 3.3010299956639813])
```

When I compute the cosine similarity from this matrix, the result is not correct, because the similarity of a document to itself is not 1:

```python
mat = IndexedRowMatrix(
    final_vectors.map(lambda row: IndexedRow(row[0], row[1]))
).toBlockMatrix()
dot = mat.multiply(mat.transpose())
```

The output for the same dataset is:

```
[[124.58719613  81.        ]
 [ 81.         407.90397097]]
```

while with the Spark TF-IDF approach it was:

```
[[1. 0.]
 [0. 1.]]
```

What is wrong with my approach?
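
One difference between the two snippets worth checking: the TF-IDF version passes its vectors through `Normalizer` before multiplying, while the custom-weight version multiplies the raw vectors. A matrix times its transpose gives plain dot products, which equal cosine similarities only when every row has unit length. The sketch below uses NumPy rather than Spark to show this on the document 0 vector from the question; `l2_normalize` is a hypothetical helper written for illustration, not part of any library.

```python
import numpy as np

# Vector for document id 0, copied from the question (custom weights, not normalized).
doc0 = np.array([9.0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
                 3.3010299956639813, 3.3010299956639813, 0,
                 3.3010299956639813, 0, 3.3010299956639813])


def l2_normalize(rows):
    """Scale each row to unit Euclidean length; all-zero rows are left unchanged."""
    rows = np.asarray(rows, dtype=float)
    norms = np.linalg.norm(rows, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # avoid dividing by zero for empty documents
    return rows / norms


mat = np.vstack([doc0])      # one row per document

raw = mat @ mat.T            # plain dot products: the diagonal holds squared L2 norms
unit = l2_normalize(mat)
cos = unit @ unit.T          # dot products of unit vectors: cosine similarities

print(raw[0, 0])             # about 124.587, the first diagonal entry reported in the question
print(cos[0, 0])             # 1.0 up to floating-point rounding
```

The raw diagonal entry is 9² + 4 × 3.3010299956639813², which reproduces the 124.58719613 in the reported output. If this is indeed the cause, dividing each custom vector by its Euclidean norm before building the `IndexedRow` objects should bring the diagonal back to 1.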