This results in a narrow dependency: e.g. if you go from 1000 partitions to 100, there will be no shuffle; instead, each of the 100 new partitions will claim 10 of the current partitions.
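The parent-to-child grouping that coalesce() sets up can be sketched in plain Python (an illustration of the arithmetic only, not actual Spark code; the real implementation also balances partitions when the counts don't divide evenly):

```python
def coalesce_mapping(num_parent, num_child):
    """Assign each of num_parent partitions to one of num_child new
    partitions, mimicking the narrow dependency coalesce() creates:
    no shuffle is needed because each child partition simply reads a
    contiguous group of parent partitions.

    Assumes num_parent is evenly divisible by num_child for simplicity.
    """
    group = num_parent // num_child
    return {child: list(range(child * group, (child + 1) * group))
            for child in range(num_child)}

mapping = coalesce_mapping(1000, 100)
print(len(mapping))      # 100 new partitions
print(len(mapping[0]))   # each one claims 10 of the old partitions
```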
repartition(), by contrast, always performs a full shuffle, which makes it the more expensive of the two methods.
Regarding Pedro's problem: RDD.reduceByKey(func, number).saveAsTextFile() is indeed expected to perform better, but hours versus 2 minutes sounds really bad. How many partitions are you going from, and what is the target number of partitions (the number in your example)?
You should probably compare the Stages tab and the stage details in the Spark UI. If you need the community's help, please share the event logs of the two runs; the application logs might be needed as well (for each case, the event log and application log must come from the same run).